Data as a Service: Key Solution Architecture Elements, Part I
marketingattitude.net

"The intrinsic value of DaaS is as a means to access a broad range of external data sources to power business processes, and augment in-house systems of record, mark it for growing market prominence."

- Tom Pringle, Ovum.
"Data-as-a-Service: The Next step in the As-a-service Journey", July 2014.

As I noted in my earlier piece, Data-as-a-Service (DaaS), as a commercial enterprise Cloud category, is here in force and rapidly gaining traction in the market. Enterprises are paying for intuitive access to external provider data sets, combined with intelligent data management services, which are then consumed within the essential applications they use daily. Vertical-specific details, consumer segment demographic attributes and international account profiles are just a few examples of the data sets available in today's broad data provider ecosystem.

But make no mistake - under the hood of elegantly intuitive consumer data delivery from a robust DaaS offering lies a complex, integrated set of platform services for data provider management and data quality controls. The latter includes a major subset of well-established Master Data Management (MDM) capabilities, and then expands on them with intelligent data searching and matching features. Let's briefly peel the onion on what these DaaS services comprise, starting with the layer set that serves Data Providers:

Layer 1: Data Provider Capabilities

  • Data Billing & Resource Management
    The capabilities here need to reflect and support the agreed-upon DaaS business model and its packaging & pricing, most likely by usage or by subscription: "By usage" charges customers a small amount for each record verified, while "By subscription" provides a consistent operating expense, and is often better for larger enterprises seeking to verify many thousands (if not millions) of records every month.
    This billing and resource tracking module also needs to handle the revenue share agreements between the DaaS provider and its external data providers, allowing for variation in terms & conditions with each of them.
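
To make the two pricing models and the revenue share concrete, here is a minimal Python sketch. Every name and number in it (`UsagePlan`, `SubscriptionPlan`, `split_revenue`, the prices, the 30% share) is a hypothetical chosen for illustration, not a figure from any real DaaS contract.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class UsagePlan:
    price_per_record: Decimal  # hypothetical fee for each record verified

    def monthly_charge(self, records_verified: int) -> Decimal:
        return self.price_per_record * records_verified


@dataclass(frozen=True)
class SubscriptionPlan:
    monthly_fee: Decimal  # hypothetical flat fee, independent of volume

    def monthly_charge(self, records_verified: int) -> Decimal:
        return self.monthly_fee


def split_revenue(charge: Decimal, provider_share: Decimal) -> tuple[Decimal, Decimal]:
    """Return (provider_payout, retained_by_daas) for one provider's agreed share."""
    payout = (charge * provider_share).quantize(Decimal("0.01"))
    return payout, charge - payout


# Illustrative numbers only: 400,000 records verified in one month.
records = 400_000
usage_charge = UsagePlan(Decimal("0.02")).monthly_charge(records)
flat_charge = SubscriptionPlan(Decimal("5000.00")).monthly_charge(records)

# Each data provider can carry its own share; 30% is an assumed value.
payout, retained = split_revenue(flat_charge, Decimal("0.30"))

print(usage_charge, flat_charge, payout, retained)
```

With these assumed prices the subscription is cheaper at high volume, which is the trade-off described above. `Decimal` is used rather than floats so that currency arithmetic stays exact.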

  • Data Provider Onboarding & Updates
    Data providers vary widely in their technical prowess, with a spectrum of content management skills for structuring their initial data set delivery and regular ongoing updates. A strong DaaS platform must have the services suite to handle this format range, from well-configured XML-mapped sets to more rudimentary full files that must be mapped (see the next bullet) and then hosted. These initial extraction processes pull in the data from the source, and perform basic validation checks to confirm that it is providing the correct values for its elements.

  • Sourced Data Mapping & Configuration Management Standards
    During data provider onboarding, clear identification of a data provider's data set profile is key. The semantic metadata from each of these input sources must include details on the data's true origin, how it has been treated & validated to date, the data set's configured structure, and the associated rights for its use.

    The last set of meta attributes is critical for adherence to 1) the approved legal rights of use, and 2) the rights consented to by the user to whom the data pertains.
    NOTE: I will cover Data Privacy in detail in a separate piece - a meaty, timely topic that certainly warrants it! Dialogue controls for transparency and compliance, including an individual user's ability to opt out of records pertaining to him/her, must also be built into the DaaS platform services set, and are core elements of the commercial DaaS value proposition to customers.

    The accompanying warehousing of these sources must include a series of metadata-level transformation services to maintain a consistent master dictionary of all data provider data sets. Each data source needs to be mapped into this master glossary, which will be used for indexing and then by the Search services (within a basic user interface) that are invoked by the consuming applications.
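
As a rough illustration of mapping provider data sets into a master glossary, the Python sketch below renames each provider's source fields to canonical names and carries the provider identity along for later rights-of-use checks. The glossary fields, provider names, and the `to_master_record` helper are all hypothetical.

```python
# Hypothetical master glossary: the canonical field names used for indexing and search.
MASTER_FIELDS = {"company_name", "postal_code", "phone"}

# Hypothetical per-provider mappings from source field names to master names.
PROVIDER_MAPPINGS = {
    "provider_a": {"CompanyName": "company_name", "Zip": "postal_code", "Tel": "phone"},
    "provider_b": {"org": "company_name", "postcode": "postal_code"},
}


def to_master_record(provider_id: str, source_record: dict) -> dict:
    """Translate one provider record into the master glossary's field names."""
    mapping = PROVIDER_MAPPINGS[provider_id]

    # Guard against a mapping that targets a field the glossary does not define.
    unknown = set(mapping.values()) - MASTER_FIELDS
    if unknown:
        raise ValueError(f"{provider_id} maps to unknown master fields: {sorted(unknown)}")

    # Start with every master field unpopulated so gaps stay visible downstream.
    master = {field: None for field in MASTER_FIELDS}
    for source_field, value in source_record.items():
        target = mapping.get(source_field)
        if target is not None:  # unmapped source fields are dropped
            master[target] = value

    master["_provider"] = provider_id  # keep the origin with the record
    return master


record = to_master_record("provider_b", {"org": "Acme", "postcode": "02139", "notes": "n/a"})
print(record)
```

Leaving unmapped master fields as `None`, rather than omitting them, is a deliberate choice here: it lets later matching rules see which elements a provider simply does not supply.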

  • Advanced Data Matching Engine
    This is the true heart of an excellent commercial DaaS offering, identifying matches between the existing customer data set exposed in the application of choice, and the selected data provider. This engine will likely utilize a combination of probabilistic rules and deterministic algorithms to achieve high confidence on matched record relationships, driving the ensuing clean updates or appending of the existing data set. Probabilistic matching uses statistical weight computations across a range of potential identifiers to calculate the probability that two records refer to the same entity; comparing that result against set thresholds classifies the pair as "a successful match", "not a match", or "possibly a match". Deterministic matching utilizes a set of well-analyzed, programmed rules that direct matching and scoring logic to seek "exact matches".

    Probabilistic matching works well with a broader set of data elements, and is most effective if it is tuned to data sets that are highly representative of the existing customer data being DaaS-treated. This includes the individual element structures as well as the combinations of them that serve specific targeted applications, their workflows and the needs of the key personas using them. A deep understanding of these guides the design of the algorithms underlying the matching & searching rules.
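
One common way to formulate probabilistic matching is the Fellegi-Sunter style of per-field agreement weights, sketched below in Python. This is a swapped-in textbook technique, not necessarily what any particular DaaS engine uses. The m/u probabilities and both thresholds are assumed values; in practice they would be tuned against representative customer data, as described above. Note that the summed weight is a log-likelihood ratio rather than a probability as such.

```python
import math

# Assumed (m, u) probabilities per field, purely illustrative:
#   m = P(field agrees | both records describe the same entity)
#   u = P(field agrees | the records describe different entities)
FIELD_PROBS = {
    "email": (0.95, 0.001),
    "phone": (0.90, 0.01),
    "postal_code": (0.85, 0.10),
}

UPPER_THRESHOLD = 6.0  # at or above: "a successful match" (assumed cut-off)
LOWER_THRESHOLD = 0.0  # at or below: "not a match" (assumed cut-off)


def match_weight(record_a: dict, record_b: dict) -> float:
    """Sum log2 agreement/disagreement weights over the compared fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        a, b = record_a.get(field), record_b.get(field)
        if a is None or b is None:
            continue  # unpopulated on either side: contributes no evidence
        if a == b:
            total += math.log2(m / u)              # agreement raises the weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


def classify(weight: float) -> str:
    if weight >= UPPER_THRESHOLD:
        return "a successful match"
    if weight <= LOWER_THRESHOLD:
        return "not a match"
    return "possibly a match"


customer = {"email": "pat@example.com", "phone": "6175550100", "postal_code": "02139"}
changed_email = {**customer, "email": "p.smith@example.com"}
stranger = {"email": "lee@example.org", "phone": "2125550199", "postal_code": "10001"}

print(classify(match_weight(customer, dict(customer))))  # a successful match
print(classify(match_weight(customer, changed_email)))   # possibly a match
print(classify(match_weight(customer, stranger)))        # not a match
```

The middle case shows why the "possibly a match" band matters: a changed email alone is not enough to reject the pair, so it can be routed to review instead.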

    Simpler deterministic matching works best if the existing customer data sets have unique identifiers (Example: Social Security number). But in the absence of these, which are often not disclosed due to corporate sensitivity, several different data elements (Examples: address, phone number, email address) must be matched separately, and their scores then summed to reach a total match score.

    To achieve significantly better DaaS match and search result rates between the existing and external data provider sources, it is imperative that cleansing & standardization routines, attuned to the existing customer data set elements, be applied to them first to raise Data Quality and to gauge the current level of Data Completeness. The data quality routines help ensure that identifiers for the same entity are consistent, and that "equivalence" values are well understood for better DaaS matching. For data completeness, the matching rules need to account for whether or not specific data elements are populated for the compared records.
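
The Python sketch below shows how standardization can feed a deterministic, field-by-field score when no unique identifier is available. The normalizers, the tiny equivalence table, and the point values are all hypothetical and deliberately simplistic; real routines would handle country codes, many more abbreviations, and so on.

```python
import re
from typing import Optional

# Hypothetical "equivalence" values for address tokens.
ADDRESS_EQUIVALENTS = {"street": "st", "avenue": "ave"}

# Hypothetical points awarded when a populated field agrees exactly.
POINTS = {"address": 30, "phone": 30, "email": 40}


def normalize_phone(value: Optional[str]) -> Optional[str]:
    digits = re.sub(r"\D", "", value or "")  # keep digits only
    return digits or None


def normalize_email(value: Optional[str]) -> Optional[str]:
    return (value or "").strip().lower() or None


def normalize_address(value: Optional[str]) -> Optional[str]:
    cleaned = re.sub(r"[.,]", "", (value or "").lower())
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    for long_form, short_form in ADDRESS_EQUIVALENTS.items():
        cleaned = re.sub(rf"\b{long_form}\b", short_form, cleaned)
    return cleaned or None


NORMALIZERS = {"address": normalize_address, "phone": normalize_phone, "email": normalize_email}


def deterministic_score(customer: dict, provider: dict) -> tuple[int, int]:
    """Return (score, possible): points earned and points available from populated fields."""
    score = possible = 0
    for field, normalize in NORMALIZERS.items():
        left, right = normalize(customer.get(field)), normalize(provider.get(field))
        if left is None or right is None:
            continue  # unpopulated on either side: neither rewarded nor penalized
        possible += POINTS[field]
        if left == right:
            score += POINTS[field]
    return score, possible


customer = {"address": "12  Main Street", "phone": "(617) 555-0100", "email": None}
provider = {"address": "12 main st.", "phone": "617.555.0100", "email": "info@example.com"}

score, possible = deterministic_score(customer, provider)
print(score, possible)  # 60 60
```

Returning the available points alongside the score is one way to account for completeness: 60 out of 60 with the email missing is a different signal from 60 out of 100 with a conflicting email.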

What's Next on DaaS?
Layer 2 Solution Architecture: Customer Capabilities

I will dive into these critical MDM platform services & more in my upcoming posting on Layer 2 DaaS Solution Architecture Capabilities for Customers, which builds upon and integrates tightly with what I've covered here on Layer 1 for external data providers. And I will also pen some thoughts on DaaS Data Privacy: Legal and Individual Rights. So please stay tuned, commercial DaaS fans!

Jack, Do you know of anyone that could help us map out our pricing model?


More articles by Jack Corsello
