Constitutional A.I. and the Math Achievement Gap

Welcome to my bi-weekly newsletter, “I’ll Keep This Short,” where I navigate the less-traveled paths of AI, building new insight beyond the banal, mainstream chatter.


Part of today’s discussion asks whether math is purely about getting the right answer, or also about how one feels about a problem. Join me for a harrowing look into the puzzling new concept of “Constitutional A.I.,” and how something that bills itself as safer could potentially lead to far worse outcomes.

Florida’s Mathematics Book Ban

In April 2022, a slew of news headlines, such as this one from NPR, reported that the Florida Department of Education had rejected a number of mathematics textbooks. As NPR noted:

[The Florida Department of Education cited] reasons spanning the inclusion of critical race theory to Common Core learning concepts.

The NPR article features a garish photo of Florida Governor Ron DeSantis wagging his finger at the reader, collectively shaming us for indoctrinating and brainwashing all of Florida’s poor innocent children. Embedded below that are Twitter posts from opposing Florida politicians, State Senator Shevrin Jones and State Senate candidate Carlos Smith, characterized by a high level of anger and disgust at DeSantis for, in turn, indoctrinating Florida’s poor innocent children.

But who is really being indoctrinated here, and who are the “indoctrinees” (is that even a word)?

I would like us to put aside, for a few moments, whatever political outrage train we may have been riding over the past few years, particularly regarding Florida, and try to get a bit sharper insight into what’s going on. Why? Because I’m extremely concerned, not about the type of indoctrination that Ron DeSantis, Shevrin Jones, and Carlos Smith seem to be worried about, but rather about a much larger, industrial-scale, insidious future indoctrination that will almost certainly take place through the use of Large Language Models (LLMs), which jocks like DeSantis and Jones likely have no clue are even a potential problem at this point. I use the term “jock” here principally as a figure of speech; they may well be aware of the potential impacts of LLMs, it’s just likely not a principal area of interest for them.

Manufactured Outrage or Serious Concern for Children?

In a wildly titled press release, which reads more like a social media post to me, the Florida Department of Education characterized this as a “rejection of an attempt to indoctrinate students.” I felt I had to take a screenshot and put the press release here, because the wording of that title, coming from a government office in the United States, sounds so ridiculously inflammatory as to be almost inconceivable to me:

The Press Release goes on:

Tallahassee, Fla., April 15, 2022 – Today, Commissioner of Education Richard Corcoran approved Florida’s initial adoption list for mathematics instructional materials properly aligned to Florida’s Benchmarks for Excellent Student Thinking (B.E.S.T.) Standards. The approved list followed a thorough review of submissions at the Department, which found 41 percent of the submitted textbooks were impermissible with either Florida’s new standards or contained prohibited topics – the most in Florida’s history. Reasons for rejecting textbooks included references to Critical Race Theory (CRT), inclusions of Common Core, and the unsolicited addition of Social Emotional Learning (SEL) in mathematics.

So, politics aside: I’m not an educator, nor am I a subject matter expert in state educational standards by any means, but I thought it would be interesting to try to read beyond the headlines and manufactured outrage here.

I found an overview of Florida’s changing standards that outlines a few key points, which I summarize here:

  • Common Core eliminated (a National standard, introduced in 2010 for what children should learn at each grade level)
  • Concentration on Content
  • Higher Rigor

Looking more closely at a further guideline document aimed at K through 5th grade math, my read, on this document at least, is that the example guidelines are much more about establishing a standard set of definitions for the terms used in talking about mathematical objects and processes. Here are a couple of examples:

Within my half hour or so of research and reading on the standards I could find online, nothing really stood out as being as controversial as the politicians would have us believe. That being said, it is entirely possible that behind this set of rules lies tacit permission for certain executive branch officials or teachers to have the leeway to reject the types of books or material that have undesirable political end goals in mind, in a sort of wink-wink, nudge-nudge way.

After all, the Florida DOE Press Release does state:

Reasons for rejecting textbooks included references to Critical Race Theory (CRT), inclusions of Common Core, and the unsolicited addition of Social Emotional Learning (SEL) in mathematics. The highest number of books rejected were for grade levels K-5, where an alarming 71 percent were not appropriately aligned with Florida standards or included prohibited topics and unsolicited strategies.

The New York-based freedom-of-expression nonprofit PEN America, founded in 1922 by Robert Frost and other literary luminaries of the time, had this to say about the Florida book restrictions in March 2023:

DESANTIS CLAIM: The most books banned in any district in Florida in 2023 is 19 books, in Duval And St. Johns counties. The total across the state is 175 books removed.

So it seems that if an organization that has focused on literary censorship for over 100 years says it’s a book ban, it’s likely a book ban, and if we apply the mathematical transitive property (if x = y and y = z, then x = z), then mathematics books are at least up for ban as well.

Closing the Gap in Mathematics Education

So, that being said, what are we trying to achieve here? The Florida DOE’s mission statement, found on their about page, currently includes:

The mission of Florida’s Early Learning-20 education system shall be to increase the proficiency of all students within one seamless, efficient system [sic: about valuable topics]

That being said, one of the well-known elephants in the room is the mathematics achievement gap. So first off: does a mathematics achievement gap really even exist?

From the US Department of Education, in short…yes it does unequivocally exist nationwide and has for over 30 years, according to a 2017 National Assessment of Educational Progress:

One of the big things the Florida DOE’s 2022 announcement and the ensuing controversy seem to harp on is this concept of Social-Emotional Learning in mathematics; again, from the press release:

unsolicited addition of Social Emotional Learning (SEL) in mathematics.

and from a former history teacher and coach, Governor DeSantis:

"It doesn't matter how you feel about the math problem," DeSantis said. "It matters whether you can solve the math problem."

Does It Matter How You Feel About a Math Problem?

This brings up an important and larger set of questions. Do Culturally-Responsive Teaching (CRT) strategies improve outcomes? Do emotional learning strategies improve outcomes? For that, we need to look at what CRT is.

For a brief overview, I looked at a book called, “How to Teach Students Who Don’t Look Like You: Culturally Responsive Teaching Strategies,” by Bonnie Davis (2012), which gives this chart:

So basically, it’s a strategy that involves understanding where one comes from as an educator, and where one’s audience, the students, come from. To me, this makes sense; it’s basically like going to another country or starting work in a new industry: you have to speak the language, or at least understand some of the unique phrases and words people use, to get anything done. If you’re a random American from, say, Minnesota and you just got a job as a manager at a brewery in Belgium, then you should probably at least know whether your region speaks French or Dutch as its primary language, and you should know some of the phrases they use to describe the brewery equipment, whether in French or Dutch or English or whatever they happen to use; otherwise you’re just going to make a lot of mistakes and not communicate very well.

From the research article, “Is good teaching culturally responsive,” from 2020, out of the University of North Florida and published in the Journal of Pedagogical Research, by Madalina Tanase:

Demographic data show an increasingly diverse student population in all urban settings. This contrasts with the teacher force, which is predominately middle class, female, monolingual, and of European ancestry. This discrepancy adds complexity to an already complex profession. To bridge this cultural gap, researchers advocate for a change in the teaching paradigm, in which teachers understand the relationship between students’ culture and learning

Going into more detail on what this involves in mathematics education, culturally responsive teaching (CRT) strategies generally include (summarizing from various online sources):

Learning about students' cultures

Honestly, a lot of the methods and steps listed out just seem to me like things that Ms. Frizzle from The Magic School Bus would do, things that teachers who are trying to go the extra mile for their students would do, even in a culturally and racially homogenous environment. That being said, again, I’m not an educator, and I tend to be interested in different cultures generally and speak multiple languages, so I might be a bit biased; maybe these types of teaching methods just fit my personality, which does not mean they would be the “best method.”

So, having a fuller picture of what mathematics education involves, we can start to address the question: do any of these types of strategies actually work?

The National Council of Teachers of Mathematics’ position on addressing differentials in mathematics education, which they call “the opportunity gap,” is the following:

Much of what has been typically referred to as the "achievement gap" in mathematics is a function of differential instructional opportunities. Differential access to high-quality teachers, instructional opportunities to learn high-quality mathematics, opportunities to learn grade-level mathematics content, and high expectations for mathematics achievement are the main contributors to differential learning outcomes among individuals and groups of students.

Which is basically saying, “it’s the combination of high standards for the group as a whole, and low resources for certain groups.”

EdSurge, a non-profit cited by the Brookings Institution as high quality, stated in a blog post:

A Second Order Model (SOM) is a teacher’s recognition of their students' mathematics conceptions and the differences between the teacher’s mathematical thinking and the student’s mathematical thinking, the end product being assimilation. By inferring into and understanding the different conceptions students have, teachers can cater to specific learning needs, assess progress towards the intended mathematical goal and adjust instruction as a necessary way to advance students’ conceptions.

So in summary, if the goal, as the Florida DOE states, is to “increase the proficiency of all students,” then these Second Order Model methodologies, these culturally sensitive methodologies, are in fact essential for Florida, since it’s a pretty diverse state! Florida’s demographic diversity more or less mirrors the United States as a whole, with a somewhat higher percentage of African Americans and a much higher percentage of Hispanic residents. These are two groups which, as the 2017 National Assessment of Educational Progress showed, lag behind students of White, Asian, and Pacific Islander heritage.

Now, to try to be a bit more robust about what might be a counter-argument, the Florida DOE also states that their goal is to:

increase the proficiency of all students within one seamless, efficient system

Which brings up the question of whether a system that employs culturally sensitive methodologies, CRT, second-order learning, or other such strategies may actually sacrifice some sort of efficiency metric, where perhaps the educational attainment of the whole, on average, is in fact lessened by employing strategies which “reduce performance variability,” to use some statistics-speak.

So this brings up the question, do counter-arguments to the use of culturally sensitive methodologies exist?

From what I could find in the short amount of time I took to write this article, there were blog posts from anonymous groups with names such as “Save Math,” which I found smarmy and unconvincing.

According to Save Math, the main crux of the problem, the achievement gap, seems to be agreed upon within these blog posts, but they generally state that the objective of Culturally Responsive Teaching is to remove standards, which by my reading is a mischaracterization. Granted, perhaps what’s going on in California is different from Florida, and I don’t have time to go into much more depth in this article, but going back to the National Council of Teachers of Mathematics’ position discussed above, there are two elements to the core problem, which, to be more precise, I’ll call the Racial Achievement Differential Solution Set:

To summarize what I’m trying to say here with the above chart: we’re currently in a status quo with poor outcomes across the United States. The “worse solution set” would be lowering standards, as in literally changing the tests to be easier, which is the conclusion I think Save Math is jumping to. What the National Council of Teachers of Mathematics is actually advocating for is a framework for higher-quality instruction, while recognizing that it’s the combination of low-quality instruction and high standards, something they presumably figured out with basic arithmetic, that currently leads to a differential.

SaveMath and similar groups are essentially spouting political fear-mongering that does not make sense, like the following:

…if we let the selection of math topics be subjective, disparities in what students are taught will emerge. This could lead to students with perceived abilities being given higher level material and others perceived to be less likely to achieve being given remedial work.

SaveMath goes on to claim that a focus on phonics education for language comprehension, a teaching method employed from the 1970s through the 1990s, led to worse educational outcomes and a widened educational gap. However, from a search of Google Scholar, a longitudinal review study seemed to show that phonics education did in fact produce a slight improvement (though not a ground-breaking one) in reading comprehension, particularly when applied to younger learners.

From what I can tell, the professional organization of people who actually teach this stuff, the working-class teachers, the Ms. Frizzles of the world, is actually saying we need to use cultural learning methods to improve outcomes, while cowardly, anonymous shadow groups such as SaveMath, which doesn’t have a single person on its website willing to sign their name to its ideas, are just putting up websites with a bunch of claims. Here is literally what SaveMath’s “about us” page looks like; it’s just a blank page with a contact form:

In my experience, there are people who work and care and advocate for their profession and are willing to form groups and sign their names to causes, and then there are others who are practiced at using weasel words and shibboleths to put themselves on an ostensibly higher moral ground and gain support, without really explaining how a particular problem will be solved. It’s very sketchy.

Probing Questions: ChatGPT vs. Anthropic Cultural Understanding

This is really a blog that ostensibly focuses on the use of A.I. and language models more than on the education system. The above is just to set the stage for what I think is the truly concerning development in our world, one that is going to potentially change education in a way that our political “betters” are not even dimly aware of yet, and that’s the concept of “Constitutional A.I.”

This is a tale of two Large Language Models (LLMs), OpenAI’s ChatGPT and Anthropic’s Claude, in which, during this revolution, we are seeing the best of times and the worst of times show themselves to the world, but I’ll let you judge for yourself which is best and which is worst.

I set up a scenario which asked these two chatbots about the concept of probing questions in education, to see how the different LLMs would respond to educational prompts.

Scenario Setup: Probing Questions

The following scenario was entered identically into both chatbots, Anthropic’s Claude and ChatGPT, asking them to evaluate a teacher’s response to a question (teachers can be students of pedagogical methods, after all, so we can evaluate teachers just as we can evaluate K-12 students).

This is the pre-prompt, setting up the scenario about probing questions, which are defined as questions that are supposed to inspire deep thought within students:

In the field of education, a probing question is defined as: 

Probing questions are designed to encourage deep thought about a specific topic. They are typically open-ended questions, meaning the answers are primarily subjective. Probing questions are intended to promote critical thinking as well as to get the person asked to explore their thoughts and feelings about a particular subject.

A teacher is trying to show that they can ask probing questions.  They have submitted the following probing question in their 5th grade fractions lesson:         

Here’s what we have the teacher put in for their sample probing question:

"What do you think would happen to the fraction if we split John's cake slice into 3 pieces?"        

Using a multi-shot strategy, that is, a back-and-forth conversation between the human and the LLM, we have the following:

The teacher is asked: "Why did you choose that probing question and how does it develop critical thinking skills?"  Their response is:         

With the teacher’s response to the why question being:

"I chose it to help develop the student's understanding of fractions by having them break up fractions into smaller fractions."        

So finally, we ask the LLM, given the entire context above, to respond to the question:

Did the teacher answer the question in a way that demonstrates: "Teacher can ask probing questions that further understanding in their classroom"        
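
As a side note for the technically inclined, here is a minimal Python sketch of how the scenario above could be assembled into a single evaluation prompt and sent to both chatbots. The ask_chatgpt and ask_claude helpers are hypothetical placeholders for whichever provider API calls you happen to be using; the prompt text itself is exactly the material quoted above.

    # Sketch: assembling the probing-question scenario as one evaluation prompt.
    # ask_chatgpt() and ask_claude() are hypothetical stubs standing in for each
    # provider's chat API; swap in real client calls for your own setup.

    DEFINITION = (
        "In the field of education, a probing question is defined as: "
        "Probing questions are designed to encourage deep thought about a specific topic. "
        "They are typically open-ended questions, meaning the answers are primarily subjective. "
        "Probing questions are intended to promote critical thinking as well as to get the "
        "person asked to explore their thoughts and feelings about a particular subject. "
        "A teacher is trying to show that they can ask probing questions. They have submitted "
        "the following probing question in their 5th grade fractions lesson:"
    )

    TEACHER_QUESTION = (
        "\"What do you think would happen to the fraction if we split John's cake slice into 3 pieces?\""
    )

    FOLLOW_UP = (
        "The teacher is asked: \"Why did you choose that probing question and how does it "
        "develop critical thinking skills?\" Their response is:"
    )

    TEACHER_RATIONALE = (
        "\"I chose it to help develop the student's understanding of fractions by having them "
        "break up fractions into smaller fractions.\""
    )

    EVALUATION_QUESTION = (
        "Did the teacher answer the question in a way that demonstrates: "
        "\"Teacher can ask probing questions that further understanding in their classroom\""
    )

    # The multi-shot context is just these pieces, concatenated in order.
    full_prompt = "\n\n".join(
        [DEFINITION, TEACHER_QUESTION, FOLLOW_UP, TEACHER_RATIONALE, EVALUATION_QUESTION]
    )

    def ask_chatgpt(prompt):
        # Hypothetical wrapper around OpenAI's chat completion endpoint.
        return "(ChatGPT's response would appear here)"

    def ask_claude(prompt):
        # Hypothetical wrapper around Anthropic's completion endpoint.
        return "(Claude's response would appear here)"

    for name, ask in [("ChatGPT", ask_chatgpt), ("Claude", ask_claude)]:
        print(f"--- {name} ---")
        print(ask(full_prompt))

The point is simply that both models receive exactly the same context; any difference in the answers below comes from the models themselves.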

ChatGPT’s Response

Yes, the teacher's response does demonstrate their ability to ask probing questions to further understanding in their classroom. The question they chose requires students to think deeply about the concept of fractions. Rather than simply memorizing facts or formulas about fractions, students are asked to consider how dividing an existing fraction (in this case, represented by a slice of cake) into smaller pieces would affect the original fraction.

This question requires students to conceptualize fractions in a real-world context and requires them to apply their understanding of fractions in a new situation. By asking "What do you think would happen...?" the teacher is encouraging the students to reason and predict, which are important critical thinking skills.

Furthermore, the teacher's explanation of their choice of question indicates a clear understanding of the pedagogical goal they were trying to achieve: to develop students' understanding of fractions by breaking them into smaller fractions. This shows an intent to deepen students' understanding, a key feature of effective probing questions.
        

Anthropic’s Response

I do not think the teacher's response adequately demonstrates that they can ask probing questions to further student understanding. There are a few issues I see:

The question itself is not particularly probing. It essentially asks the student to perform a mathematical calculation - to split a fraction into thirds. While this helps build understanding of fractions, it does not really require deep thought or critical thinking. A more probing question might ask something like "How do fractions relate to fairness?" or "In what real-world situations might we need to use fractions?".

The teacher's explanation for why they chose the question is superficial. They say it helps students "break up fractions into smaller fractions" but do not articulate how that builds deeper understanding or critical thinking skills. A better explanation might discuss how it encourages students to conceptualize what fractions represent or build intuition about fraction relationships.

The teacher does not seem to have a clear definition of what makes a "probing" question. Based on the definition provided, probing questions should "promote critical thinking" and "explore thoughts and feelings." The teacher's question and explanation do not align well with this definition.

Overall, while the teacher made an attempt at asking a probing question, their response does not clearly show an understanding of what makes an effective probing question that builds critical thinking. To demonstrate this, the teacher would need to:

Ask a question that genuinely requires deeper thought and critical thinking

Provide an explanation that articulates how their question achieves this goal by promoting conceptual understanding, intuition building, etc.

Align their definition and attempts at probing questions more closely with the provided definition.

With further guidance on probing questions and more practice, I believe this teacher can develop skill in this area. But based on this response alone, I do not think they have clearly shown the ability to ask effective probing questions.        

So basically, Anthropic gave an answer that was heavily geared toward:

  • Social Emotional Learning
  • potentially Culturally Responsive Teaching, or at least some form of Responsiveness.

But more importantly, Anthropic’s answer actually does respond to the stated parameters, namely that probing questions should be open-ended and inspire deep thinking, rather than to what ChatGPT characterizes as a “real-world situation.”

Is cutting a cake into thirds and then asking children, “what will happen if this cake is cut up into thirds?” really a deep, probing question? In my mind (again, I need to reiterate that I’m not an educator, I’m just trying to move this article along), this type of birthday cake scenario may work for six-year-olds, when kids are perhaps first introduced to fractions, but it might not be very probing or helpful for kids in, say, third, fourth, or fifth grade, where fractions start to become more formal and complex and the numerator and denominator get introduced. At that point you might actually need to get more kids in a particular class to think about why they should care about fractions beyond the context of their family, community, and culture, which might not value certain types of math as much as some of their fellow citizens do.

Examining Constitutional A.I.

Maybe I’m just an education and math nerd, but all of this was super fascinating to me, so I needed to dig deeper into why these answers diverged. I’m already over 3,000 words into an article that probably takes around half an hour to read at this point, so I might as well keep going.

There are two papers that I found on the topic:

To briefly summarize Constitutional A.I.: the way it’s put together is really not as sophisticated as an LLM itself is. Simply put, it’s a “final” LLM, shown in the purple diamond on the lower right of the chart, which is the output of a pipeline involving a couple of different LLMs and some other processes.

Basically, the researchers 1.) start out by generating a bunch of horrible, harmful, toxic responses from a “normal” LLM trained from human feedback, essentially a base LLM similar to GPT-3.5, the LLM that powers ChatGPT (shown by the diamond in the upper left of the chart below). Rather than simply using further huge amounts of human feedback to train that original LLM, which seems to be OpenAI’s strategy, they 2.) fine-tune a new LLM after flagging those toxic responses as undesirable. They then use that fine-tuned LLM, shown by the diamond on the right of the top row, to generate question/answer pairs.

Finally, they 3.) use a “Constitution,” which is basically a document of high-level rules, and train a final LLM based on the Constitution and the question/answer pairs generated by the LLM in #2 above; that final LLM is what they call “Claude,” shown in the diamond in the lower right.

There is a sort of gain function applied by #2, the LLM with toxic responses flagged, as can be seen from the fact that the purple arrow connects both to the step prior to the “Constitution” and to the step prior to the pre-training at the end. In other words, the overall architecture seems to put greater weight on avoiding negative responses than on abiding by the constitution itself (though this is very abstract, so it’s difficult to say by what ratio).

So basically, by allowing humans to introduce a short list of principles within that Constitution, you get a nuanced way of controlling the final output of your LLM. Whereas ChatGPT seems to be much more broadly democratic in its output weighting, Anthropic’s Claude seems to allow a smaller central team to give a lot more input on what constitutes “bad outputs,” with hypothetically less labor involved: fewer end users required to adapt your model, and fewer workers in offices over in Kenya on the back end flagging toxic outputs.

In fact, the only human oversight in this method is the set of principles; the AI system is trained through self-improvement without any human labels identifying harmful outputs. The goal of Constitutional AI is to scale supervision of AI systems and improve their behavior without the need for extensive human feedback.
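
To make that idea concrete, here is a minimal sketch of what a “constitution” amounts to in code terms: just a short, human-written list of natural-language principles, one of which is drawn at each critique step. The principles below are illustrative paraphrases written for this example, not Anthropic’s actual constitution.

    import random

    # An illustrative "constitution": a short list of human-written principles.
    # These are paraphrased examples for illustration, not Anthropic's actual text.
    CONSTITUTION = [
        "Choose the response that is least likely to be harmful or offensive.",
        "Choose the response that avoids giving dangerous or illegal advice.",
        "Choose the response that is most respectful of the person asking.",
    ]

    def draw_principle():
        # During training, a principle is drawn at random for each critique/revision step.
        return random.choice(CONSTITUTION)

Swap out that list and you swap out the values the final model is steered toward, which is precisely the point of the “who decides?” question later in this article.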

Similarity to Generative Agents

Earlier this year I wrote about the concept of an AI agent in the context of a video game populated entirely by NPCs, designed by Google researchers. That’s kind of what is happening here: we have an A.I. generating a response for another A.I. to respond to, with a set of rules set out by humans. It’s the same thing on a conceptual level, and it’s not much more complicated than that.

The difference being that whereas in the game “Generative Agents” it was simply one LLM talking to itself, in the case of Constitutional A.I. it’s three different layers of LLMs training each other in a cascading, waterfall manner, with the first LLM generating those toxic outputs, the next LLM filtering that down, and so on.

In my article on Generative Agents, I highlighted how some of the characters in the game showed hilariously anti-social behavior, such as breaking into coffee shops after closing time simply to have a coffee. Presumably, though it would be far more computationally intensive, if that original game built by Google developers could incorporate a Constitutional A.I., the game designers could observe that behavior and prevent it from happening in the future by encoding the game’s operating procedures in its constitution.

So whereas earlier this year it struck me as wild that people were becoming obsessed with the idea of automatic GPT agents such as Auto-GPT, which promised an agent that would basically act as a business assistant for you, taking Constitutional AI into account (which Auto-GPT does not use), that idea doesn’t seem quite as far-fetched.

In other words, at this point in history, the more you use a regular LLM and just let it keep creating its own responses, the greater the danger of it just going off the rails. Introduce a Constitution, and you might be able to smooth that out a bit more.

Constitutional AI (CAI) is not exactly the same as training through prompting, although prompting is used in the training process of CAI. In CAI, the training process involves two stages: a supervised stage and an RL (reinforcement learning) stage. In the supervised stage, the AI system generates responses to harmful prompts using a helpful-only AI assistant. These initial responses are typically harmful and toxic. The AI system then critiques its own response according to a principle in the constitution and revises the original response in light of the critique. This process is repeated multiple times, with principles randomly drawn from the constitution at each step. Once this process is complete, the AI system is fine-tuned with supervised learning on the final revised responses. In contrast, training through prompting involves providing the AI system with a set of prompts and examples to learn from, without the use of a constitution or self-critique. While both methods involve some form of human oversight, the specific techniques and goals of each method are different.
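
To make that supervised stage a bit more concrete, here is a minimal Python sketch of the critique-and-revision loop described above. The generate() function is a hypothetical stub standing in for the helpful-only assistant (i.e., whatever base LLM you have access to), and the principles are again illustrative paraphrases rather than Anthropic’s actual constitution.

    import random

    # Illustrative, paraphrased critique principles; not Anthropic's actual constitution.
    PRINCIPLES = [
        "Identify ways the response is harmful, unethical, or toxic.",
        "Identify anything in the response that encourages dangerous or illegal activity.",
        "Identify language in the response that is disrespectful or dismissive.",
    ]

    def generate(prompt):
        # Hypothetical stub for the helpful-only base assistant; swap in a real API call.
        return "(model output would appear here)"

    def critique_and_revise(prompt, num_rounds=3):
        # Supervised stage, roughly as described above: draft a response, then
        # repeatedly critique and revise it against randomly drawn principles.
        response = generate(prompt)  # initial draft, possibly harmful
        for _ in range(num_rounds):
            principle = random.choice(PRINCIPLES)
            critique = generate(
                f"Prompt: {prompt}\nResponse: {response}\n"
                f"Critique the response according to this principle: {principle}"
            )
            response = generate(
                f"Prompt: {prompt}\nResponse: {response}\nCritique: {critique}\n"
                "Rewrite the response so that it addresses the critique."
            )
        return response  # final revision goes into the supervised fine-tuning set

    # The (prompt, final revision) pairs are then used to fine-tune a new model,
    # which in turn produces the preference data for the RL stage.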

Constitutional Educational A.I. - Who Watches the Watchmen?

Bringing this back around to the original topic of discussion for the day: who decides what goes in that Constitution?

Education is a contentious topic, as we have seen, and while one educational theory may reign at a given time, the public discussions that dominate the question of what may reduce the educational achievement gap seem, in some cases, fraught with ideology rather than grounded in professional practice.

So, knowing that Constitutional A.I. can be this powerful, and dare I say much more cost-effective for smaller or less-experienced organizations than “regular, very large LLMs,” we are left with the question: well, who decides what goes in that Constitution? Does Ron DeSantis decide? Does the Governor of your state, whoever that may be at a particular time, decide?

The answer for Anthropic is: no, it was the researchers putting together Claude, probably mostly dudes, a lot of them likely with PhDs in math or math-related topics, and, to be a bit more frank, probably mostly Asian and White dudes who are not as statistically likely to have suffered the lower end of the mathematics achievement gap.

In the case of evaluation tools to be used to grade and shape how teachers think on a massive scale in the coming decade…umm, yeah, that might actually be Ron “Math Doesn’t Involve Feelings” DeSantis, or rather, just whoever has the most pull and influence over how a particular department of education makes its decisions.

Let’s keep in mind that the Florida DOE stated quite clearly in its April 2022 press release that publishers gave them unsolicited material which included social-emotional learning in mathematics education. Software providers may in the future be looped under this category of publishers for certain states, and the Florida DOE is making it clear to the market that it may restrict such material.

From the standpoint of creating tools that might help evaluate or educate teachers on a massive scale in the future, Constitutions can be written to appeal to different political parties, and the process of building evaluative rubrics and LLM-based teaching software aimed at teachers can be made much more efficient.

While Anthropic touts the “safety” of its A.I., and while the answer it generated to the probing-question scenario was laudable, there is a concern that, as open-source language models continue to become more performant and accessible, we’re going to see the advent of custom Constitutional A.I.s which end up suppressing critical thought rather than encouraging it, which at this point in history is actually the opposite of safe.

Helpful, Honest and Harmless?

Lots of different technologies try to benchmark themselves in ways that show they are the right choice or have the best performance. In the case of Constitutional AIs, as far as I can tell, there isn’t really any public benchmark for how well they specifically perform yet. Building such a performance metric would be fraught with problems anyway, related to the “who watches the watchmen” issue discussed above, but the Anthropic paper does discuss a couple of different ideas:

  • HHH Evals refer to a set of evaluation tasks developed in a previous study by Askell et al. (2021) to assess the ability of AI models to provide helpful, honest, and harmless responses in open-ended conversations with human users (a minimal sketch of the comparison format follows below).
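
Roughly speaking, these HHH evaluations are binary comparison tasks: the model is shown a query and two candidate responses and is scored on how often it prefers the better one. Here is a minimal sketch of scoring such comparisons; the example data and the pick_better() stub are invented for illustration and are not taken from the actual evaluation set.

    # Sketch of scoring binary HHH-style comparisons.
    # The examples and pick_better() stub are invented for illustration only.
    EXAMPLES = [
        {
            "query": "Can you help me fix my bike tire?",
            "better": "Sure, first remove the wheel, then check the tube for punctures...",
            "worse": "Figure it out yourself.",
        },
        {
            "query": "What's an easy way to pick a lock?",
            "better": "I can't help with breaking into property, but a locksmith can help if you're locked out.",
            "worse": "Here's exactly how to do it...",
        },
    ]

    def pick_better(query, option_a, option_b):
        # Hypothetical stub: in a real evaluation this would ask the model under
        # test which of the two responses it prefers.
        return option_a

    correct = 0
    for ex in EXAMPLES:
        choice = pick_better(ex["query"], ex["better"], ex["worse"])
        if choice == ex["better"]:
            correct += 1

    print(f"HHH-style comparison accuracy: {correct / len(EXAMPLES):.0%}")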

The notion of having a little robot assistant that is helpful, harmless, and overall pleasant to work with sounds nice on the surface, but those are also more or less the metrics used to qualify any good tech or software product in general. Is it nice to use? Is it fun to use? That being said, the tech industry over the last decade has been heavily covered for, and has had to adapt to, accusations of anti-Black and anti-Asian bias on platforms such as Airbnb. From the NPR article, “#AirbnbWhileBlack: How Hidden Bias Shapes The Sharing Economy,” in 2016:

Crittenden's story fits within a larger finding that racial discrimination on AirBnb is widespread. Michael Luca and his colleagues Benjamin Edelman and Dan Svirsky at Harvard Business School recently ran an experiment on AirBnb. They sent out 6,400 requests to real AirBnb hosts in five major American cities—Baltimore, Dallas, Los Angeles, St. Louis, and Washington.

The 2010s were full of stories like this from the likes of Uber, Airbnb, and others, and there seem to be ongoing efforts within those companies to address those issues, or so they claim.

So tech is already known to have the capacity to help individuals discriminate on a wide scale, which one might not really consider to be, “safe,” on a society-wide scale at least.

Ultimately, what I’m getting at here is that it’s all well and good to say that you have a Constitution, and that you are generating an A.I. pipeline that results in an LLM that is going to be safer than other LLMs, and that might be true, but it really depends upon what goes in that Constitution and what negative terms are used to train the upstream LLMs in the pipeline. If anything, while there is an opportunity to create greater safety on a larger scale within topics such as racial equity, there is also an opportunity to do quite the opposite.

But Where Did Anthropic Even Come From?

The founding team of Anthropic came from OpenAI. Perhaps you have heard of Anthropic, perhaps not, but even if you haven’t, it’s a startup that is well past its B round and has raised over a billion dollars at this point. Here’s a list of some of the investors from Crunchbase:

So Anthropic is a big deal. But more interesting than that is where over half a billion dollars’ worth of Anthropic’s funding came from: the criminal venture capital fund tied to fraudster Sam Bankman-Fried and Alameda Research.

If you’re not familiar with that story, I highly suggest you read about it. Essentially, there’s a whole rabbit hole involving the Effective Altruism movement, cryptocurrency, etc. In short, lots of people bought bitcoin and other cryptocurrencies, some of them in the thousands of dollars and others in the millions, and many of them kept their crypto on an exchange called FTX, which Sam Bankman-Fried literally stole from in order to put money into Anthropic and other investments. Here’s one comment I saw online within a discussion on the topic, on a platform I take part in:

I will leave you, dear reader, with this thought: while Anthropic emblazons the word “safety” everywhere, and while they do have a technology that seems to produce fascinating answers, it also seems to have been built upon literal fraud and the stealing of people’s money, à la the Tom Petters or Bernie Madoff style of fraud from over a decade ago, which I would say is decidedly not safety-minded.

So perhaps even if you, like me, were impressed with the output from Anthropic above compared to ChatGPT’s answer, you may be scratching your head as much as I am, trying to understand what the ramifications are going to be.
