Compliance requirements for generative AI content in China

Artificial Intelligence Generated Content (AIGC) is content produced with Generative Artificial Intelligence (GAI) technology. It also represents a new mode of content production, following Professional Generated Content (PGC) and User Generated Content (UGC), in which AI technology automatically generates content.

Since the advent of ChatGPT, GAI's powerful content generation capabilities have brought unprecedented convenience and creativity to users. However, with the widespread application of GAI technology, the content compliance issues of AIGC have gradually surfaced. Because GAI is applied in public-facing scenarios, some GAI technologies are capable of mobilizing public opinion or society. Therefore, whether generated content complies with legal provisions, conforms to public order and good customs, and avoids social harm has become one of the core issues of GAI governance.

In this regard, Article 4 of the Interim Measures for the Administration of Generative Artificial Intelligence Services (hereinafter referred to as the "Interim Measures") provides that both providers and users of generative artificial intelligence services should avoid generating AIGC that contains "content prohibited by laws and administrative regulations, such as inciting subversion of state power, overthrowing the socialist system, endangering national security and interests, damaging the national image, inciting secession of the country, undermining national unity and social stability, promoting terrorism, extremism, promoting ethnic hatred, ethnic discrimination, violence, pornography, and false and harmful information" (hereinafter collectively referred to as "bad information").

AIGC containing the above-mentioned harmful content has been brought within the scope of criminal law regulation because it infringes state and public legal interests. For example, producing and publishing such illegal content may be suspected of constituting the crimes of inciting secession of the country, inciting subversion of state power, promoting terrorism and extremism, inciting terrorist activities, inciting ethnic hatred and ethnic discrimination, and disseminating obscene materials (hereinafter collectively referred to as "content-based crimes"). At the same time, AIGC and GAI are Internet-based, so they may also involve the new cybercrimes added by the Criminal Law Amendment (IX), including the crime of refusing to perform information network security management obligations, the crime of illegal use of information networks, and the crime of assisting information network criminal activities.

This article explores in depth the criminal risks that AIGC may involve and the responsibilities that different roles in the AIGC industry chain should bear under different circumstances, in order to provide a reference for relevant practitioners.

I. The generation principle of AIGC

AIGC is generated by GAI, and the process begins with GAI training. The generation principle of AIGC can be summarized as follows:

(I) Data collection and preprocessing

GAI requires a large amount of training data, which usually comes from the Internet, including books, articles, conversations, etc. To ensure training quality, the raw data is usually preprocessed through cleaning, tokenization (word segmentation), and stop-word removal so that the model can learn effectively.
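As an illustration only, the following minimal Python sketch shows what such preprocessing might look like; the stop-word list and the simple regex tokenizer are assumptions made for this example, not the pipeline of any particular GAI system.

```python
import re

# Hypothetical, simplified preprocessing: cleaning, tokenization ("word
# segmentation"), and stop-word removal on a toy English corpus.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "is"}  # assumed toy list

def preprocess(raw_text: str) -> list[str]:
    # Cleaning: strip HTML-like tags and normalize whitespace.
    text = re.sub(r"<[^>]+>", " ", raw_text)
    text = re.sub(r"\s+", " ", text).strip().lower()
    # Tokenization: naive word split (real systems use trained tokenizers).
    tokens = re.findall(r"[a-z0-9']+", text)
    # Stop-word removal.
    return [t for t in tokens if t not in STOP_WORDS]

if __name__ == "__main__":
    sample = "<p>The model learns   from a LARGE amount of text data.</p>"
    print(preprocess(sample))
    # ['model', 'learns', 'from', 'large', 'amount', 'text', 'data']
```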

(II) Model construction and training

Common GAI model architectures include recurrent neural networks (RNNs), transformers, and generative adversarial networks (GANs). These models learn to capture contextual information, structural features, and probability distributions in the data. To improve performance, they are trained on large amounts of data so that their ability to predict and generate results continuously improves.
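The architectures named above are far beyond the scope of a short example, but the idea of "learning a probability distribution from data" can be sketched with a toy bigram model; this is purely illustrative and not how production GAI systems are built.

```python
from collections import defaultdict

def train_bigram_model(corpus: list[str]) -> dict[str, dict[str, float]]:
    """Estimate P(next word | current word) by counting bigrams in the corpus."""
    counts: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for cur, nxt in zip(tokens, tokens[1:]):
            counts[cur][nxt] += 1
    # Normalize the counts into conditional probabilities.
    model = {}
    for cur, nexts in counts.items():
        total = sum(nexts.values())
        model[cur] = {w: c / total for w, c in nexts.items()}
    return model

if __name__ == "__main__":
    corpus = ["the cat sat", "the cat slept", "the dog sat"]
    model = train_bigram_model(corpus)
    print(model["cat"])  # {'sat': 0.5, 'slept': 0.5}
```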

(III) Content generation and output

Once GAI training is complete, the model can receive input from users or systems. These inputs can be text, images, audio, or other data. Based on the input's contextual information and the learned probability distribution, GAI generates new content, commonly called AIGC, which can likewise be text, images, audio, etc.
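Continuing the toy bigram idea above, generation is then a matter of repeatedly sampling the next token from the learned conditional distribution until an end marker appears; again, this is a hypothetical sketch with hard-coded probabilities, not an actual GAI system.

```python
import random

# Assumed toy conditional distributions P(next | current), e.g. as produced
# by the bigram sketch above; real GAI models learn far richer distributions.
MODEL = {
    "<s>":   {"the": 1.0},
    "the":   {"cat": 0.7, "dog": 0.3},
    "cat":   {"sat": 0.5, "slept": 0.5},
    "dog":   {"sat": 1.0},
    "sat":   {"</s>": 1.0},
    "slept": {"</s>": 1.0},
}

def generate(model: dict, max_len: int = 20) -> str:
    """Sample tokens from the learned distribution until the end marker."""
    token, output = "<s>", []
    for _ in range(max_len):
        nexts = model.get(token)
        if not nexts:
            break
        words = list(nexts)
        token = random.choices(words, weights=[nexts[w] for w in words])[0]
        if token == "</s>":
            break
        output.append(token)
    return " ".join(output)

if __name__ == "__main__":
    print(generate(MODEL))  # e.g. "the cat slept" or "the dog sat"
```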

II. Analysis of the reasons why AIGC contains harmful content

Based on the generation principle of AIGC, if AIGC contains bad information, the reasons may include:

(I) Problems with training data

On the one hand, if the training data set contains bad information, the model may learn and generate similar content; on the other hand, if the labels in the training data do not correctly identify bad content, the model cannot distinguish bad information from normal information, which leads the model to generate content containing bad information.
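As a hedged illustration of the first point, a simple screening pass might drop corpus entries that hit a keyword blocklist before training; the terms and corpus below are placeholders, and real systems rely on far more sophisticated classifiers and human review.

```python
# Hypothetical blocklist-based screening of training data; the terms and the
# corpus here are placeholders, not a real moderation vocabulary.
BLOCKLIST = {"badterm1", "badterm2"}

def screen_corpus(corpus: list[str]) -> tuple[list[str], list[str]]:
    """Split a corpus into (kept, rejected) entries based on a keyword blocklist."""
    kept, rejected = [], []
    for doc in corpus:
        tokens = set(doc.lower().split())
        (rejected if tokens & BLOCKLIST else kept).append(doc)
    return kept, rejected

if __name__ == "__main__":
    corpus = ["a harmless sentence", "a sentence containing badterm1"]
    kept, rejected = screen_corpus(corpus)
    print(len(kept), len(rejected))  # 1 1
```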

(II) Problems with model architecture

Some GAI models were not designed with full consideration for detecting and filtering bad content. The models only generate content based on data distribution but cannot make moral or ethical judgments on the content. This may also cause the generated content of the model to contain bad information.

(III) Problems with input content

For interactive GAI (such as chatbots), the content input by users may itself contain bad information, and GAI may continue the context and reproduce that bad information when generating a reply. In addition, if GAI does not understand the context of user input well, a user may deliberately use obscure or suggestive language, making it impossible for GAI to effectively filter the input or refuse to generate bad information.
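For interactive services, one mitigation is to screen user input before it reaches the model and refuse to respond when it is flagged. The sketch below uses a naive keyword rule as a stand-in for a real moderation model; the blocklist terms and refusal text are assumptions for illustration.

```python
REFUSAL = "Sorry, this request cannot be processed."
INPUT_BLOCKLIST = {"badterm1", "badterm2"}  # hypothetical placeholder terms

def handle_user_input(prompt: str, generate_reply) -> str:
    """Refuse flagged prompts instead of passing them to the generative model."""
    if any(term in prompt.lower() for term in INPUT_BLOCKLIST):
        return REFUSAL
    return generate_reply(prompt)

if __name__ == "__main__":
    echo = lambda p: f"model reply to: {p}"
    print(handle_user_input("tell me about badterm1", echo))  # refusal
    print(handle_user_input("tell me about cats", echo))      # model reply
```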

(IV) Lack of appropriate filtering and monitoring mechanisms

Finally, if generated content does not go through appropriate post-processing steps (such as content filtering, keyword blocking, and bad-information detection), GAI may directly output content containing bad information. In addition, in applications that create content in real time (such as dialogue systems), the lack of a real-time monitoring mechanism may also lead to bad information not being discovered and handled in time.
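In a much simplified form, the post-processing and monitoring steps mentioned here could look like the sketch below: every generated reply is checked against a detector before output, and flagged events are logged for review. The keyword-based detector is a placeholder assumption; production systems combine classifiers, rules, and human review.

```python
import logging

logging.basicConfig(level=logging.INFO)
OUTPUT_BLOCKLIST = {"badterm1", "badterm2"}  # placeholder for a real detector

def detect_bad_information(text: str) -> bool:
    """Stand-in detector: real systems use classifiers, rules, and human review."""
    return any(term in text.lower() for term in OUTPUT_BLOCKLIST)

def filter_output(generated: str, user_id: str) -> str:
    """Block flagged content and log the event for monitoring; else pass through."""
    if detect_bad_information(generated):
        logging.warning("blocked generation for user %s", user_id)
        return "[content withheld by content-safety filter]"
    return generated

if __name__ == "__main__":
    print(filter_output("a normal answer", user_id="u1"))
    print(filter_output("an answer containing badterm2", user_id="u2"))
```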

III. Criminal risk analysis of AIGC containing harmful content

(I) Types of legal liability subjects involved in the AIGC industry chain

Referring to the "Regulations on the Management of Deep Synthesis of Internet Information Services" (from now on referred to as the "Deep Synthesis Regulations"), the participants in the deep synthesis industry chain are divided into deep synthesis service providers, deep synthesis service technology supporters, and deep synthesis service users. Combined with the above introduction to the AIGC generation principle, the types of legal liability subjects involved in the AIGC industry chain can also be divided into:

Generative artificial intelligence service providers (hereinafter referred to as "service providers") refer to organizations and individuals that provide generative artificial intelligence services.

Generative artificial intelligence service technology supporters (hereinafter referred to as "technology supporters") refer to organizations and individuals that provide technical support for generative artificial intelligence services.

Generative artificial intelligence service users (hereinafter referred to as "service users") refer to organizations and individuals who use generative artificial intelligence services to produce, copy, publish, and disseminate information.

In practice, service providers often provide generative artificial intelligence services based on their own models and technologies. In that case, the service provider is also a technology supporter, and the parts of the following analysis concerning technology supporters apply equally to such service providers and will not be repeated.

It is worth noting that although the Interim Measures distinguish only between two roles, generative AI service providers and generative AI service users, this does not mean that the Interim Measures impose no compliance requirements on technology supporters. For example, Article 7 of the Interim Measures regulates training data processing activities for GAI, and Article 8 regulates data labeling during GAI development. In practice, where technology supporters carry out the relevant activities, they should still comply with the corresponding provisions.

(II) Analysis of the responsibilities of various entities under different reasons

Where the service provider, technology supporter, and service user have conspired in advance to intentionally create and use GAI to commit a criminal act, they can be held criminally liable as joint offenders. In general, however, service users are unspecified and often do not meet the conditions for becoming a responsible entity. For example, building, renting out, and selling VPNs is likely to constitute the crime of providing programs and tools for intruding into and illegally controlling computer information systems, or the crime of illegal business operations, while the mere use of VPNs is not currently subject to criminal law evaluation (it remains an unlawful act). However, if users employ GAI or the AIGC it outputs as a tool to engage in other criminal activities, such as using AI face-swapping software to generate obscene pictures for sale and dissemination online, such conduct is independently punishable under criminal law. Therefore, the following mainly discusses how service providers and technology supporters in the AIGC industry chain should be held responsible, according to different circumstances, when AIGC contains content carrying criminal risk.

1. Problems with training data or model architecture

If bad information is generated due to problems with training data or model architecture, the technology supporter may be held criminally liable. However, considering that conviction under China's criminal law system is based on the principle of unity of subjective and objective elements, whether the technology supporter should bear criminal responsibility also requires examining whether it had the subjective intent to generate or disseminate bad information.

The technical complexity of AIGC makes it difficult to directly determine whether the technology supporter acted with subjective intent. At the same time, China's current laws, regulations, regulatory rules, and technical standards have put forward a number of requirements for training data and model training. In practice, whether the technology supporter had subjective intent may be inferred from whether it has fulfilled the relevant legal obligations and implemented the relevant technical standards. Specifically:

① Use data and basic models with legal sources

Article 7 of the Interim Measures requires “using data and basic models with legal sources.”

What is “data with legal sources”? Generally speaking, it should refer to data obtained through legal means, such as agreement acquisition, legal public collection, etc., rather than illegal means, such as stealing or collection without permission.

What is “basic models with legal sources”? Article 17 of the Interim Measures has put forward precise filing requirements for generative AI services with public opinion attributes or social mobilization capabilities. In addition, according to the window opinions we obtained from consulting the Cyberspace Administration of China, the current regulatory authorities have a relatively broad standard for determining “having public opinion attributes or social mobilization capabilities.” The filing obligation must be fulfilled if the service users include the general public. Therefore, whether the filing work is completed in accordance with the Interim Measures can be used as one of the reference standards for judging whether the relevant basic models are legal. Of course, the illegality of administrative supervision is not the same as the illegality of criminal offenses. However, the basic model for filing needs to be evaluated for security during filing. The use of such models can significantly reduce the risk of service users generating bad information.

② Ensure the quality of data annotation

According to Article 8 of the Interim Measures, when data annotation is carried out during the development of generative artificial intelligence technology, the service provider or technology supporter shall formulate clear, specific, and operable annotation rules that meet the requirements of the Measures and cover the identification of illegal content; it shall also assess the quality of data annotation and sample-check the accuracy of the annotated content; and it shall provide necessary training for annotation personnel, enhance their awareness of respecting and abiding by the law, and supervise and guide them to carry out annotation work in a standardized manner. Fulfilling these obligations reduces the possibility of being found to have subjective intent.
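As a rough illustration of the "sample and verify the accuracy of the annotation content" requirement, the sketch below estimates annotation accuracy from a random sample re-checked against reviewer judgments. The record schema, sample size, and quality threshold are arbitrary assumptions for illustration, not values prescribed by the Interim Measures.

```python
import random

def sample_annotation_accuracy(annotations: list[dict], sample_size: int = 100,
                               seed: int = 0) -> float:
    """Draw a random sample and compare annotator labels against reviewer labels.

    Each record is assumed to look like {"label": ..., "reviewer_label": ...};
    this schema is hypothetical.
    """
    rng = random.Random(seed)
    sample = rng.sample(annotations, min(sample_size, len(annotations)))
    correct = sum(1 for a in sample if a["label"] == a["reviewer_label"])
    return correct / len(sample)

if __name__ == "__main__":
    data = [{"label": "ok", "reviewer_label": "ok"}] * 95 + \
           [{"label": "ok", "reviewer_label": "illegal"}] * 5
    acc = sample_annotation_accuracy(data, sample_size=50)
    print(f"sampled annotation accuracy: {acc:.0%}")
    if acc < 0.95:  # assumed internal quality threshold, not a regulatory figure
        print("accuracy below threshold: re-annotation or further review may be needed")
```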

③ Corpus security and model security requirements

In addition to the Interim Measures, the National Cybersecurity Standardization Technical Committee issued the Basic Requirements for Generative Artificial Intelligence Service Security on March 1, 2024, which puts forward requirements for corpus source security, corpus content security, corpus annotation security, model security, and security measures.

2. Lack of proper filtering and monitoring mechanisms

Whether the content generated by GAI contains bad information due to problems with training data, model architecture, or input content, service providers must set up proper filtering and monitoring mechanisms to prevent bad information from being output to service users.

Of course, the Interim Measures also state that the principles of giving equal importance to development and security and of combining the promotion of innovation with governance according to law should be adhered to, and that generative AI services should be subject to inclusive, prudent, classified, and graded supervision. Service providers should guide the design of algorithms with correct values and, according to the standard of the ordinary reasonable person, reasonably exclude content containing bad information in the course of algorithm application, rather than being required to absolutely prevent all infringing results. On the one hand, for illegal or criminal problems caused by deliberate human guidance in algorithm design or training (for example, screening and filtering mechanisms that run contrary to public order and good customs), a strict liability standard should be applied to service providers; on the other hand, if service providers and technology supporters have set up proper filtering and monitoring mechanisms, and service users nevertheless generate bad information that the providers and supporters could not foresee during specific operations, it is not appropriate to blame the service providers or technology supporters excessively.

3. Problems with input content

If the appearance of bad information in AIGC is indeed caused by the deliberate guidance of service users through their input (such as obscure or suggestive language), then, since the service users have the subjective intent to commit the relevant content-based crime, the specific content-based crime they are suspected of can be determined based on their specific behavior.

In this case, the criminal liability of the service provider or technology supporter should also depend on its attitude toward the service user's illegal/criminal behavior, that is, whether it pursued or indulged the result. There are two forms of subjective intent in criminal offenses: active intent, where the actor hopes that the harmful result or danger will occur; and indulgent intent, where the actor recognizes that his or her act or omission will lead to a harmful result or danger but takes no measures to avoid it.

If the evidence in a case is sufficient to prove that the service provider or technology supporter had active intent, there is no doubt that it should bear the corresponding criminal liability. In practice, the difficulty usually lies in whether the service provider or technology supporter had indulgent intent and how such intent can be proved. In fact, given the sheer volume of AIGC input data, the complexity of the algorithms, and the black-box effect produced by multiple iterations of the algorithms, even a service provider or technology supporter that strives to be legal and compliant during training will find it challenging to avoid all illegal or criminal problems in specific usage scenarios. In practice, a service provider or technology supporter can generally follow the principles below to perform its review and supervision obligations regarding the illegal behavior of service users on the AIGC products it operates, and use this as proof that it had no intent to indulge the illegal/criminal behavior of service users:

① Safe Harbor Principle

As a legal principle widely used in the field of copyright, the safe harbor principle means that when a copyright infringement occurs and the network service provider only provides space services and does not produce the page content, the provider is obliged, upon learning of the infringing content, to delete it or take measures such as blocking it or disconnecting links. If the provider fails to take such measures promptly after clearly knowing the facts of the infringement, it must bear liability. Although the safe harbor principle originates in the field of copyright, it still has certain reference value when judging whether a technology supporter or service provider acted with intent.

② Red Flag Principle

The red flag principle is likewise a principle from the field of copyright. When the facts of infringement are so obvious that they are "flying like a red flag," the service provider cannot pretend not to see them or shirk responsibility by claiming ignorance. That is, where, according to common sense and the basic duty of prudence it should fulfill, the service provider should have been aware of the infringement, yet it fails to delete the links, it should bear infringement liability even if the right holder has not issued a deletion notice.

(III) The impact of the principle of technological neutrality

1. The impact of technological neutrality

In cases involving Internet crimes, when "technological neutrality" becomes the main defense point, service providers or technology supporters usually emphasize the legality and compliance of their product design, operation, and publicity and demonstrate the obligations they have fulfilled in terms of supervision and review. They often claim that they are subjectively unaware of the illegal behavior or have tried their best to correct the user's unlawful behavior after knowing it, and at the same time emphasize that objectively, due to factors such as technological limitations, it is not feasible to eliminate such behavior.

Technological neutrality has no special meaning in criminal law, nor is it a ground for acquittal. Technology itself should not be the object of criminal law evaluation; it is the act of providing technology that may constitute a crime. Whether providing technology constitutes a crime can only be determined on the basis of the specific provisions of criminal law and the traditional theory of complicity, returning to the connection between subjective cognition and criminal intent. A typical example is the Qvod case, where the defense emphasized that "Qvod provided technical services and did not disseminate, publish, or search for pornographic videos, nor did it provide assistance; Qvod technology was not a tool specifically for distributing pornographic videos, but rather provided cache services to improve network transmission efficiency and provide users with P2P video-on-demand technical services; based on the principle of technological neutrality, the safe harbor principle should be applied to Qvod's behavior, and Qvod should not bear criminal responsibility for disseminating pornographic materials for Internet users." The prosecution provided evidence that Qvod's dispatch server not only pulled pornographic videos into the cache server but also served the pornographic video files in the cache server to clients. This proved that Qvod participated in the dissemination of pornographic videos, undermining the basis of its claim to technological neutrality.

The Qvod case ultimately resulted in a conviction for the crime of disseminating pornographic materials for profit. At its peak, Qvod had become almost synonymous with pornographic video playback; its illegality was already apparent, and the "red flag principle" was applied to attribute responsibility to it. Of course, there are also views in the academic community questioning the reasonableness of the verdict: had the case occurred after the release of the Criminal Law Amendment (IX), it might have been more appropriate to characterize it as the crime of refusing to perform information network security management obligations.

2. Consideration of joint crime or independent crime

From the perspective of legislative trends, the crimes of assisting information network criminal activities, refusing to perform information network security management obligations, and illegal use of information networks added by the Criminal Law Amendment (IX) all reflect China's proactive stance on the criminal punishment of cybercrimes and AI-related crimes. Elevating what would be an accomplice under traditional complicity theory into an independent principal offender, in order to cope with increasingly active Internet- and AI-related crimes, also places higher compliance requirements on service providers and technology supporters. Even if there is no shared criminal intent to commit a specific crime between the user and the service provider, the service provider and technology supporter can be convicted on the basis of a presumption of subjective knowledge of the role their network products and AI products play in facilitating the commission of criminal acts.

Specifically, with regard to AIGC, Article 9 of the Interim Measures requires service providers to assume the responsibility of network information content producers, while Article 14 explicitly requires that "if a provider discovers illegal content, it shall promptly take measures such as stopping generation, stopping transmission, and deletion, take measures such as model optimization training to rectify the situation, and report to the relevant competent authorities. If a provider discovers that a user is using generative artificial intelligence services to engage in illegal activities, it shall take measures such as warnings, restricting functions, and suspending or terminating the provision of services to the user in accordance with the law and the contract, preserve relevant records, and report to the relevant competent authorities."

Article 10 of the Deep Synthesis Regulations also stipulates that "deep synthesis service providers shall strengthen the management of deep synthesis content and take technical or manual measures to review the input data and synthesis results of users of deep synthesis services. Deep synthesis service providers shall establish and improve feature libraries for identifying illegal and bad information, improve the standards, rules, and procedures for entry into such libraries, and record and retain relevant network logs. Where a deep synthesis service provider discovers illegal or bad information, it shall take disposal measures in accordance with the law, keep relevant records, and report to the cyberspace administration and relevant competent departments in a timely manner; it shall also take disposal measures such as warnings, restricting functions, suspending services, and closing accounts against the relevant deep synthesis service users in accordance with the law and the contract."

It can be seen that China's regulatory authorities currently impose a high duty of care on AIGC service providers. If a service provider fails to fulfill this duty of care and a service user uses GAI to generate illegal content, the provider will be ordered by the regulatory authorities to make corrections.

Recently, the Jiulongpo District Cyberspace Administration imposed administrative penalties on Chongqing Chuchang Technology Co., Ltd., the operator of the local "Kaishanhou" AI writing website, for failing to fulfill its review and management obligations and its primary responsibility as an operator. The "Kaishanhou AI Writing Master" website operated by the company generated information prohibited by laws and regulations, and the company's failure to fulfill its primary responsibility violated the "Cybersecurity Law of the People's Republic of China," the "Interim Measures for the Management of Generative Artificial Intelligence Services," and other relevant laws and regulations. The authority ordered the company to make comprehensive rectifications within a time limit, strengthen the review of information content, improve its information content security management systems, and suspend website information updates and the AI-generated writing function for 15 days.

IV. Conclusion

Based on general social cognition and social common sense, users who use AI face-changing and AI stripping software to create and disseminate obscene materials for profit have been subject to criminal prosecution, and issues such as AI voice chats containing pornographic content or generating obscene images have also begun to attract social attention. The social harm of these behaviors is apparent. However, in the context of imperfect laws and regulations, it is crucial to reasonably set the criminal liability of product providers. If the scope of the responsible party is too large or the responsibility is too heavy, it may be detrimental to the innovative development of the AI industry; if the scope of the responsible party is too small or the responsibility is too light, it may make AIGC products a tool for illegal crimes.

In addition, even as the external legal framework is still being improved, service providers, technology supporters, and service users should respect the law and the public interest, establish compliance awareness at the source, and guard against risks arising from the absence of product compliance measures.
