A Theory of Data with Practical Legal Implications (with an Invitation to Comment & Share)

An Invitation to Comment & Share: This article is meant to serve as a forum for discussing coherent notions of data and related concepts, with the hope of formulating a comprehensive theory that can ground policy discussions around data rights and obligations. I welcome--and truly hope to receive--comments from people in a variety of backgrounds and disciplines. Public comments (below) are ideal, because they encourage an evolving discussion. But I'm happy to receive private comments as well if you'd like to message me directly. While this discussion will help me to refine the theory presented below for a more traditional publication in a law journal, I believe we should not be stuck in old ways of thinking about the exchange of ideas with so many new ways to participate in meaningful discussions at our disposal. So, please share your ideas and send this article to your thoughtful friends and colleagues so they can comment too. Together, we can develop a better way to think about data and, thereby, a better way to deal with data's legal implications. Thanks, Parker


INTRODUCTION

Data is considered one of the most valuable asset classes for the twenty-first century enterprise. This awareness will intensify as data continues to fuel the digital economy and stimulate multibillion-dollar transactions. Yet, data’s value proposition is just one concern in debates about data’s role in modern life. Other concerns include personal and social issues related to privacy, access to information, and freedom of expression. These concerns reflect conflicting perspectives, which meld together to obscure data’s proper legal treatment. As a result, recent legal discourse has focused on how to strike the right balance between competing interests when formulating a data governance regime and, more specifically, accompanying rights in data.

Rights in data are widely misunderstood by attorneys and non-attorneys alike. This misunderstanding is excusable given politicians’ and scholars’ failure to develop a comprehensive legal framework for dealing with rights in data (properly understood). Without such a framework, the legal community cannot articulate coherent, practical, and theoretically sound data rights, because the intellectual scaffolding for doing so is missing. This article aims to fill that void.

This article examines data’s nature, data’s relation to its associations (i.e., data’s elements, means of expression, processing, and accuracy), and data’s legal treatment in practice. In doing so, this article’s primary contribution is to present a systematic way to think about data and its associations that is useful for delineating related rights and conducting further legal analysis. More broadly, this article provides a logical framework upon which coherent data policies can be built.

This article proceeds by presenting a definition for data and distinguishing data from its associations to facilitate legal analysis.


DATA & ITS ASSOCIATIONS

Policymakers must articulate data’s nature and relation to other concepts (labeled “associations”) to successfully evaluate the merits of its legal treatment, both in practice and in theory. That effort is nothing new, as philosophers have wrestled with models of information since at least the Hellenistic period. Since then, enduring questions have tended to center on how information (or data), as a distinct concept, stands in relation to its content, expression, use, and accuracy. This Part tackles these questions for a clear functional purpose: to distill data’s meaning, distinguish its many associations, and thereby outline key concepts to assist further legal inquiries.

A. Data Defined

In a legal sense, data is an abstract proposition regarding an object of consideration’s state. The state may designate, for example, that object’s act, composition, kind, quality, relation to other objects, or position within a system for coordinating perceptions. When the object of consideration is placed in that context, the resulting idea is subject to factual examination in theory but must be expressed in a comprehensible way to be utilized in practice. For this reason, data is often expressed in a physical medium that facilitates processing activities, such as storage and transmission. While such expressions are what intelligent beings and processes technically interact with when dealing with data, data is an idea independent of any physical manifestation.

B. Data’s Associations

The above notion is useful for delineating rights and obligations related to data because it distinguishes data from data’s legally significant associations, including: (1) data’s elements, especially the data subject; (2) the means of expressing data, both in a general sense (i.e., mode of expression) and in a particular instance (i.e., medium of expression); (3) the analytic operations performed on data directly (i.e., reasoning) and indirectly via its physical manifestations (i.e., processing); and (4) the connection between data and the empirical universe or some defined logical system (i.e., accuracy). Each such association and its basic legal significance are discussed below, in turn.

1. The Elements of Data

For an idea to be considered data under the above definition, it must both: (i) reference an object of consideration, and (ii) posit something about that object’s state. Both of these elements are essential. Without context, the object of consideration is a mere theoretical placeholder. Without reference to an object, a potential state is a mere instrument of comprehension. Thus, the object of consideration must be put in context (or, said differently, the context must be anchored by the object of consideration) to conceive a proposition subject to factual examination that is properly called “data.”
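
As a rough illustration only, the two-element test above can be sketched in Python. The class name `DataPoint`, its field names, and the example values are hypothetical and are not part of the theory itself; the sketch assumes that each element can be represented as a short text string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataPoint:
    """Hypothetical model: a proposition about an object's state."""
    data_subject: Optional[str]   # the object of consideration
    posited_state: Optional[str]  # what is claimed about that object

    def is_data(self) -> bool:
        # Both elements must be present for the idea to count as data.
        return self.data_subject is not None and self.posited_state is not None


# A complete proposition: an object placed in context.
complete = DataPoint("the warehouse at 12 Main St.", "is owned by Alice")
# An object without context: a mere theoretical placeholder.
placeholder = DataPoint("the warehouse at 12 Main St.", None)
# A state without an object: a mere instrument of comprehension.
bare_state = DataPoint(None, "is owned by Alice")
```

Under these assumptions, only `complete` satisfies `is_data()`; the other two values each lack one essential element.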

(i) The Data Subject

For particular data points, the object of consideration is the data subject. The data subject may reference someone or something that exists in reality or that acts as a reference point in some hypothetical system. For this reason, a data point may not completely articulate every feature of the data subject’s discoverable state and, thus, may not serve to fully identify the data subject in a broader sense. However, when a data point is combined with a set of related data points and considered in the right context, enough information may be presented to examine the data subject in more detail and with superior empirical accuracy. This feature expands a data point’s potential uses by increasing its explanatory power.

The relationship between a data point and its data subject is legally significant due to its potential real-world uses. The data subject may be an actual person or a thing in which an actual person has an interest. So, data points may have direct links to actual persons, who hold legal rights and have politically significant expectations. This link is a concern for two diverging reasons: (a) information revealed by data points may be used in a way that harms related persons; and (b) another’s access to that information may serve some acceptable, or even socially desirable, function. Whether these reasons for concern apply to a particular data point ultimately depends on the information it reveals about the data subject (i.e., the posited state) and how that information can be used.

(ii) The Posited State

Like the data subject, the posited state articulated by a data point may or may not align with reality or some logical system. That alignment ultimately depends on the data subject, because the data subject anchors the posited state to the empirical or logical environment in which the hypothesized situation is examined. Consequently, this examination into the posited state is often secondary to an inquiry into the nature of the data subject.

The relationship between a data point and the posited state is legally significant due to the real-world implications of its accuracy. Actual persons or processes may rely on the posited state’s factual accuracy when utilizing the related data. As a result, processors (or their controllers) may be harmed when they process data that does not align with expectations for factual accuracy. Similarly, another’s processing of inaccurate data may unfairly harm data subjects or those with interests in data subjects. Thus, concerns regarding data subjects and posited states are intertwined.

In short, data’s elements each bring about legal concerns that help to shape the broader issues revolving around data, namely use and accuracy. However, before fully considering how data is used or the extent to which it can be considered accurate, it is important to consider how data is expressed and, thereby, comprehended.

2. The Means of Expressing Data

Data is an abstract proposition that must be expressed to be comprehended. There are two levels to this requirement. First, in a general sense, data must be expressed in a mode of expression that is theoretically comprehensible, like linguistic or pictorial representation. Second, in a particular instance, data must be expressed through an experiential/tangible medium of expression that facilitates actual comprehension, such as a digital file or painting. Each of these levels presents unique concerns from a policy standpoint.

(i) The Mode of Expression

As for the mode of expression, its connection to data is tenuous. For one, the same data may be expressed through various modes of expression and, thus, no particular mode of expression is essential. Secondly, a particular mode of expression may not convey the same data to all processors in all contexts and, thus, modes of expression are imperfect. Finally, a mode of expression may provide tangential utility independent of the data it conveys (or is designed to convey), such as aesthetic or functional utility.

For the above reasons, the social and economic value of modes of expression is not necessarily tied to the data such modes convey (or are designed to convey). Instead, modes of expression, as a distinct form of association, tend to be valued for their aesthetic or functional qualities. Thus, legal regimes governing the modes of expression are often unconcerned with the protection of related data.

The modes of expressing data are legally significant because their creation usually involves some effort and resource allocation. Despite such an investment, an author who expresses data in a particular mode may not realize the full benefit of that expression, because a mode of expression (distinct from any physical manifestation) is technically non-excludable, can be enjoyed by others at relatively little marginal cost, and has utility associated with positive externalities. Thus, the law may need to step in to incentivize the creation of modes of expression to achieve an optimal level of development.
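
The observation that no particular mode of expression is essential can be illustrated with a minimal Python sketch. The example data point, the function names, and the choice of an English sentence and a JSON object as the two modes are assumptions made purely for illustration.

```python
import json

# Hypothetical data point: (data subject, posited state).
data_point = ("Sensor 7", "reads 21 degrees Celsius")


def as_sentence(point):
    """Linguistic mode: a plain English sentence."""
    subject, state = point
    return f"{subject} {state}."


def as_json(point):
    """Structured mode: a JSON object with two named fields."""
    subject, state = point
    return json.dumps({"subject": subject, "state": state})


def from_json(text):
    """Recover the data point from its JSON expression."""
    obj = json.loads(text)
    return (obj["subject"], obj["state"])


sentence = as_sentence(data_point)
encoded = as_json(data_point)
```

Here `sentence` and `encoded` are different strings, yet `from_json(encoded)` returns the original pair. The sketch is only meant to suggest that the proposition survives a change in its mode of expression.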

(ii) The Medium of Expression

As for the medium of expression, its connection to data is secondary to practical considerations regarding its tangible form. For one, a medium of expression is subject to possession, allowing for potential access to or exclusion of the expressed data. Secondly, a medium of expression facilitates isolated processing activities, with no bearing on data processing that utilizes other mediums of expression. Finally, the medium of expression’s connection to the data it conveys (or is meant to convey) is dependent on the embodied mode of expression, especially to the extent it embodies a mode of expression of particular aesthetic or functional qualities. Thus, the medium of expression presents concerns that precede (and may supersede) concerns regarding expressed data.

For the above reasons, a medium of expression’s social and economic value may correlate with the expressed data’s use value, but such correlation depends on extraneous factors. This feature complicates the task of coordinating policies for protecting data with policies for protecting the mediums of expressing data. Thus, legal regimes directly governing mediums of expression as property are distinguishable from legal regimes indirectly governing expressed data based on tort law.

Any medium of expressing data is legally significant, because it is composed of tangible resources that are susceptible to allocation. Like all other tangible resources, a medium of expression: is finite; may be possessed, transferred, and excluded from access; and has inherent value, at least in terms of opportunity costs. Thus, the law must provide a scheme for allocating such resources and governing their use.

3. Data Processing

Data’s connection to its medium of expression facilitates processing activities, in the broad sense. Processing activities include: (1) capturing data in a medium of expression; (2) storing the captured data in one or more mediums of expression; (3) transferring expressed data to another’s possession through a medium of expression; (4) destroying an instance of expressed data by destroying its medium of expression; and (5) using expressed data to make a decision, capture new data, or for some other purpose. Each of these processing activities presents unique policy concerns and is discussed below, in turn.

(i) Data Capture

Data is captured when it is expressed through a comprehensible medium. Data capture may occur through different processes, such as automated processes (e.g., IoT device recordings), intentional processes (e.g., someone taking a photograph), or biological processes (e.g., human experience). However, regardless of the process, data capture’s result is always (by definition) of the same kind: the representation of data in a comprehensible manner.

Data capture, as a distinct processing activity, is legally significant because it may involve an invasion of privacy or proprietary rights. In this sense, the concern is that the capture method may involve an invasion of another’s reasonable expectation of privacy or a physical trespass on another’s tangible property. This concern may not fully capture an aggrieved person’s ultimate worry about the content of captured data and how it will be used, but the concrete actions associated with data capture necessarily precede potential data uses. Thus, the legal concern with data capture (at the point of capture) remains with how data is captured (rather than what data is captured), albeit with an eye towards why data is captured and how the captured data will be further processed.

(ii) Data Storage

Captured data may be stored in one or more mediums of expression. In this sense, data storage is the fixation of a mode of expression in a tangible medium that facilitates an intelligent being’s or intelligent process’s comprehension of expressed data. Although data is technically stored at the moment of capture, this article focuses on the types of data storage that enable expressed data to be comprehended by more than one individual or process through experiential or technical means.

These types of data storage, as distinct processing activities, are legally significant because they cause the expressed data to be susceptible to unauthorized processing. In this sense, these types of data storage present security concerns related to expressed data. For this reason, those who control or possess stored data may be required to implement administrative, physical or technical safeguards to protect it. Thus, the legal concern here is with stored data’s susceptibility to processing by different processors.

(iii) Data Transfer

Data may be transferred among processors through mediums of expression. Data transfers may involve the physical transfer of data storage media (e.g., handing over a USB flash drive containing a digital file) or the use of one or more transient mediums of expression (e.g., sending a smoke signal). This conception of data transfers includes any manner in which data may be intentionally communicated to another processor, including verbal disclosures, but does not include mere alterations in data storage. Thus, the primary focus of data transfers is on the processors (i.e., transferors and transferees), not the utilized transfer method.

Data transfer, as a distinct processing activity, is legally significant because it allows intended transferees to further process the expressed data in unauthorized or unanticipated ways. The concern is not that the transfer may not be secure (which is technically a data storage concern) but, instead, that the intentional transfer itself may be unauthorized by the correct controller or otherwise induced by improper means. For example, the transferor may be contractually prohibited from disclosing the expressed data to its intended transferee, or an intended transferee may have received disclosure authorization through fraudulent means. Thus, the legal concern here is with control over who is given the ability to further process expressed data.

(iv) Data Destruction

An instance of expressed data is destroyed when the medium of its expression ceases to exist or is irreversibly rendered incapable of facilitating its comprehension. This concept is not to say that the abstract proposition is destroyed or that the expressed data cannot survive in other mediums of expression. Instead, the concept is that an intelligent being or process can no longer comprehend the data that was once expressed through a now non-existent or irreversibly altered medium of expression.

Data destruction, as a distinct processing activity, is legally significant because it may cause the loss of all reliable forms of expressed data. As a practical matter, it may be costly or impossible to capture the same data by alternate, acceptable means. Such a scenario can be problematic when the lost data helps processors to understand a broader context or to complete a desired task. Thus, the legal concern here is with loss of the once-expressed data’s processing potential.
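
The distinction between destroying one instance of expressed data and losing every instance can be sketched in Python. The `Medium` class, its methods, and the invoice example are hypothetical; the sketch assumes a medium can be modeled as an object holding a single text expression.

```python
class Medium:
    """Hypothetical stand-in for a tangible medium of expression."""

    def __init__(self, name, expression):
        self.name = name
        self._expression = expression  # the expressed data held by this medium

    def destroy(self):
        # Irreversibly remove the expression from this medium only.
        self._expression = None

    def comprehend(self):
        # Returns the expression, or None if nothing can be comprehended.
        return self._expression


expression = "Invoice 1001 was paid on 3 May."
paper = Medium("paper copy", expression)
disk = Medium("disk copy", expression)

# Destroying one instance leaves the other instance untouched.
paper.destroy()
```

After `paper.destroy()`, the paper copy conveys nothing, while the disk copy still conveys the same proposition. The loss described above arises only if every such medium is destroyed and the data cannot be recaptured by acceptable means.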

(v) Data Use

Data capture, storage, transfer and destruction are all secondary to data use, which involves inputting expressed data into an analytical process that yields some ultimate result (e.g., a decision). Consequently, the important feature that distinguishes use from other processing activities is that use (i.e., intentional reasoning with particular data points) operates on data directly, rather than indirectly through data’s medium of expression. Although tangible expression is a prerequisite to data use, data (the abstract proposition) is the ultimate raw material for such usage via analytical processes. In this sense, data’s end is usage.

The above understanding is legally significant because it pinpoints the primary concern regarding data processing in general as well as the practical concern regarding data itself in debates over the proper legal regime for governing it. In each case, the concern is control over how data (regardless of its factual accuracy) is used or can be used. While there may be an impulse to attempt to ground rights in data separated from its medium of expression, doing so ignores the fact that concerns rest with data usage, not data in a vacuum, and usage relies on mediums of expression. Thus, the legal concern here is with the practical control of outcomes of processing data, independent of its content or its factual accuracy.

4. Factual Examinations of Data

Data is subject to factual examination in theory. As an abstract proposition, a data point presents discrete positive (or negative) claims regarding a data subject’s state. Such claims may be tested in the empirical or logical environment in which the data subject is positioned. The focus of such an examination depends on the sense in which data may be considered accurate.

(i) Empirical Accuracy

Data may be accurate in the empirical sense, meaning the data subject’s posited state comports with experiential reality. Experiential reality, however, does not lend itself to infallible examination. Instead, the empirical accuracy of any given data point is often clouded by diverse or incomplete perceptions regarding the expressed data or relevant circumstances.

This type of accuracy is also somewhat misleading in practice because it assumes that instances of expressed data will be interpreted in a certain way. However, data must be expressed to be conceived, and expressions often convey different messages to different processors. Consequently, when discussing the empirical accuracy of data, it is often important to view the expressed data from an identified audience’s perspective. This shift in focus has the odd result of bringing a hypothetical inquiry to bear on an empirical question. Thus, for practical purposes, empirical accuracy may not be as definitive as the term suggests and often cannot be simplified to a binary choice between “true” and “false.”

The above understanding is legally significant for three main reasons. First, expectations regarding the empirical accuracy of data often determine how such data will be processed, raising the concern that false data will be used to a processor’s or other related person’s detriment. Second, different processors may attribute different data to the same mode of expression, creating an opening for ambiguous or misleading modes of expression to propagate. Third, a practical evaluation of empirical accuracy is inherently procedural and does not lend itself to absolute truths. For these reasons, the law may be concerned about the empirical accuracy of both the data meant to be conveyed and the data actually conveyed with a given mode of expression, but must address such concerns indirectly through procedural means.

(ii) Logical Accuracy

Data may be accurate in the logical sense, meaning the data subject’s posited state comports with the logical implications of given assumptions. Such assumptions may specify parameters for some conclusion that triggers a pre-determined consequence outside of the logical system. Thus, logical implications of data may yield practical outcomes.

This type of accuracy places the assumptions and processes of the underlying logical system at the forefront of consideration. Assuming empirical accuracy (if necessary), logical accuracy or inaccuracy will necessarily follow from given assumptions and processes. Because that accuracy may have practical consequences, the design of the logical system will be of utmost concern.

The above understanding is legally significant, because logical systems are often used to make economically, politically and socially important decisions, albeit sometimes covertly. The assumptions and processes used by such systems may treat a data subject’s characteristic in a way that society views as illegitimate or unfair. Thus, the law may have an interest in restricting the types of assumptions or processes used to make important decisions or, at least, exposing such assumptions and processes.
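
The dependence of logical accuracy on system design can be sketched in Python. The eligibility rule, the income figures, and the threshold values below are invented for illustration and do not describe any real decision system.

```python
def implied_state(income, threshold):
    """State that follows from the system's assumptions (hypothetical rule)."""
    return "eligible" if income >= threshold else "ineligible"


def is_logically_accurate(posited_state, income, threshold):
    # Accurate in the logical sense: the posited state matches what the
    # assumptions imply, whatever the empirical facts may be.
    return posited_state == implied_state(income, threshold)


# Same input and same posited state, evaluated under two system designs.
under_design_a = is_logically_accurate("eligible", income=40_000, threshold=30_000)
under_design_b = is_logically_accurate("eligible", income=40_000, threshold=50_000)
```

The same posited state is logically accurate under the first design and inaccurate under the second, although nothing about the data subject has changed. This is one way to see why the assumptions built into such a system, rather than the data alone, may warrant legal scrutiny.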


C. Policy Questions

The above discussion regarding the theoretical nature of data and its associations presents important questions regarding how data should be treated in practice. Central to these questions has been the issue of whether data can be considered a form of property and, thus, the object of rights distinct from any rights regarding its associations. This issue, however, yields an even deeper, often ignored, issue: whether data is even susceptible to a rational legal regime’s direct governance. I will explore this issue and related legal concerns in a future publication.


© 2021 Parker N. Smith

* Parker is an attorney and the founder of CoreServe Legal, LLC, a law firm based in New Orleans, Louisiana, USA. Parker’s practice primarily focuses on helping clients with intellectual property and information technology transactions and advising clients on ancillary matters related to technological innovation, data privacy, and analytics.


