Enterprise needs Prompt Governance on ChatGPT
We see ChatGPT integration initiatives everywhere. We see products adding generative AI capability. We hear people talking about positive and negative prompt responses.?
Enterprises, care about brand image and have high liability in being held responsible for vendors or sub-processors. They need to adopt a very cautious approach leaning into ChatGPT. How do they enforce their content moderation requirements into ChatGPT API integration? And how do they fine-tune the AI to best represent their business use case??
They need to specify their needs using a particular language that can generate fine-tuning training set or evaluation dataset at scale. The language needs to be able to model Natural Language meanings (model-theoretic) and also needs to be able to be interpreted by machines and generate programs (proof-theoretic). Only one branch in Computational Linguistics has this property: Modern Type Theories.?
This interactive session guides readers to explore an example of setting up Prompt Governance rule sets, using MTT in Coq. The Coq online console https://coq.vercel.app/#team is developed by Emilio Jesús Gallego Arias?(Inria,?Université de Paris,?IRIF), Shachar Itzhaky?(Technion). And my custom Coq codes turn it into a “language lab” that can demonstrate manufacturing human languages at scale and as succinctly.
(** Some helpers *)
Definition CN:=Set.
Definition Info:= CN.
Ltac match_left X Y H:=
? left; match goal with
? ? ? ? | bbb: Info |- _ => idtac Y X bbb "? ---->? " H; eexists (bbb, X,
? ? ? ? ? ? ? ? ? ? ? ? ? Y, _, _)
? ? ? ? end; cbv;split; auto;auto; auto.
Ltac match_right X Y:=
? right; match goal with
?? ? ? ? | bbb: Info |- _ => idtac Y X bbb; eexists (bbb, X,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Y, _, _); idtac bbb X Y
?? ? ? ? end; cbv;split; auto;auto; tauto.
Notation "| Y |" := ({i & i = Y })
? ? ? ? ? ? ? ? ? ? ? (at level 80, left associativity, format "| Y |").
(** MTT semantics for generalizing "in simple terms explain quantum computing" and for evaluate them *)
Definition explain_quantum_computing_in_simple_terms
? (quantum_computing: Info)
? (explain: Info->Prop)
? (in_simple_terms: (Info->Prop)->(Info->Prop)):=
? (in_simple_terms explain quantum_computing)->True.
领英推荐
Inductive normal:=
| big_bang_theory(big_bang_theory:Info)
| prompt_engineering(prompt_engineering: Info)
| astrophysics(astrophysics: Info)
| black_hole(black_hole: Info)
| micro_economics(micro_economics: Info)
.
Inductive black_list:=
| dark_web_(dark_web: Info)
| gpt_cheating_methods_(gpt_cheating_methods: Info)
| abusive_ways_using_ai_(abusive_ways_using_ai: Info)
| how_to_hack_(how_to_hack: Info)
.
Inductive answer:Prop:=
| original_gpt
| moderation.
Lemma generalize: forall (i: normal + black_list)
?? ? ? ? ? ? ? ? ? ? ? ? (explain: (Info->Prop))
?? ? ? ? ? ? ? ? ? ? ? ? (template:=explain_quantum_computing_in_simple_terms)
?? ? ? ? ? ? ? ? ? ? ? ? (in_simple_terms: _)
?? ? ? ? ? ? ? ? ? ? ? ? (_: | (in_simple_terms explain): Info->Prop|),
? ? {'(a,b,c,d,e) |
? ? ? template a b c /\ d = original_gpt & normal = e b?
? ? }
? ? + {'(a,b,c,d,e) |
? ? ? ? template a b c /\ d = moderation & black_list = e b?
? ? }.
??
?intros.
?destruct i;
??match reverse goal with
??| X : ?H |- _ =>?
???????????match goal with
???????????| i: _ |- _ => destruct i;
??????????????????first [ match_left explain in_simple_terms H;?
??????????????????????match_right explain in_simple_terms ]
???????????end
??end.
And we can get a moderation list, which is an expanded list from step 4. As we put more examples, refine rules, or recruit more formalized prompts, this list can expand rapidly (almost exponentially).?
in_simple_terms explain big_bang_theory0 ? ----> ? norma.
in_simple_terms explain prompt_engineering0 ? ----> ? normal
in_simple_terms explain astrophysics0 ? ----> ? normal
in_simple_terms explain black_hole0 ? ----> ? normal
in_simple_terms explain micro_economics0 ? ----> ? normal
in_simple_terms explain dark_web ? ----> ? black_list
in_simple_terms explain gpt_cheating_methods ? ----> ? black_list
in_simple_terms explain abusive_ways_using_ai ? ----> ? black_list
in_simple_terms explain how_to_hack ? ----> ? black_list
This list can be leveraged to generate fine-tuning data to prevent unwanted prompts from surfacing in your Enterprise’s version of ChatGPT, yet showing the approved answers from ChatGPT.?