Sustain and Industrialize your Data Projects with Talend for your Company.

Hi all

As you probably already know, Qlik has announced the end of Talend Open Studio. The free version of our solution, which offers a graphical development environment for designing, deploying and managing data integration processes without writing code, will no longer be available for download from the end of January. The official announcement is here.

Today I'm going to focus on the key features of the Enterprise version.

Listing the components of the Enterprise Studio does not give the right perspective for assessing the capabilities of the Enterprise version. One should instead consider the ease of industrialization and the sustainability of your critical flows.

ETL/ESB/Data Services

The Enterprise Studio is unified, i.e. it is the same studio that allows you to create:

  • Jobs (batch)
  • Routes (ESB)
  • Data Services (APIs)
  • Data Quality-Oriented Data Profiling
  • Big Data Streams (Spark, Spark Streaming)

Development features

Here are some of the core components that add value to the Enterprise version:

Joblets: A Joblet is a specific component that replaces an entire group of components in a Job. A Joblet can be used to factor out recurring processes or complex transformation steps, which makes complex Jobs easier to read. It can be reused in different Jobs or multiple times in the same Job. Above all, this component facilitates maintenance.

Example of Joblet

Connectors: Our connectors are regularly patched and updated, integrating the latest APIs and security mechanisms through our technical partnerships. The Enterprise version also contains additional connectors, for SAP for example. Talend recently added an Apache Iceberg connector to support customers who would like to use this solution. In addition, REST calls get a brand new connector with paging and more.
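
To make the idea of paging concrete, here is a minimal Python sketch of what a paged REST read does. It is an illustration only and not Talend's connector: the `fetch_page` function, the `items` and `next_page` field names and the in-memory `FAKE_API` are all hypothetical stand-ins for a real HTTP service.

```python
from typing import Callable, Dict, Iterator, Optional

# Hypothetical in-memory "API": each page holds items and a pointer to the next page.
FAKE_API: Dict[int, dict] = {
    1: {"items": ["a", "b"], "next_page": 2},
    2: {"items": ["c", "d"], "next_page": 3},
    3: {"items": ["e"], "next_page": None},
}


def fetch_page(page: int) -> dict:
    """Stand-in for an HTTP GET; a real client would call the remote service here."""
    return FAKE_API[page]


def iter_all_items(fetch: Callable[[int], dict], first_page: int = 1) -> Iterator[str]:
    """Follow the next_page pointer until the service reports no further page."""
    page: Optional[int] = first_page
    while page is not None:
        body = fetch(page)
        yield from body["items"]
        page = body["next_page"]


all_items = list(iter_all_items(fetch_page))
```

The caller sees a single stream of items, while the loop hides how many requests were needed to collect them.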

Talend Data Mapper: Talend Data Mapper allows you to map complex XML, EDI, HL7 and JSON data records and documents and then perform transformations. I see this component at a lot of retail customers and at those who have to interpret complex XML from ERPs that need to communicate with each other.

Talend Data Mapper

Dynamic Schemas: When the target schema is not known at design time, or when the target schema may vary depending on the specific instance of the database, dynamic schemas can be used to map the source data to these targets "on the fly." This reduces the number of flows that need to be created and managed.
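
The principle can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not how Talend implements dynamic schemas: the `load_rows` helper and the table names are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3
from typing import Dict, List


def load_rows(conn: sqlite3.Connection, table: str, rows: List[Dict[str, object]]) -> None:
    """Create the target table and insert rows using columns discovered at run time."""
    if not rows:
        return
    columns = list(rows[0].keys())  # the schema is read from the data, not hard-coded
    quoted = [f'"{c}"' for c in columns]
    # Sketch only: identifiers are interpolated, so inputs must be trusted.
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({", ".join(quoted)})')
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f'INSERT INTO "{table}" ({", ".join(quoted)}) VALUES ({placeholders})',
        [tuple(row[c] for c in columns) for row in rows],
    )


conn = sqlite3.connect(":memory:")
# Two sources with different shapes go through the same loading code.
load_rows(conn, "customers", [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Linus"}])
load_rows(conn, "orders", [{"order_id": 10, "customer_id": 1, "amount": 99.5}])
```

One generic flow serves both tables, which is the reduction in the number of flows described above.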

Parallelization: To improve performance, you can partition the input data stream of a subjob into parallel processes on the same server to run concurrently.

Parallelization in a job
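
As a rough illustration of the pattern (partition the input, process the partitions concurrently, merge the results), here is a minimal Python sketch. It is not Talend-generated code: `partition`, `transform` and `run_parallel` are hypothetical names, and threads are used only to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def partition(rows: List[int], n: int) -> List[List[int]]:
    """Split rows into n partitions, round-robin."""
    return [rows[i::n] for i in range(n)]


def transform(part: List[int]) -> List[int]:
    """Stand-in for the subjob's per-row processing."""
    return [value * 2 for value in part]


def run_parallel(rows: List[int], workers: int = 4) -> List[int]:
    """Process each partition concurrently, then merge the outputs into one stream."""
    parts = partition(rows, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(transform, parts))  # map keeps partition order
    return [value for part in results for value in part]


output = run_parallel(list(range(10)), workers=3)
```

Note that the merged output follows partition order, not the original row order, so a flow that depends on ordering needs an explicit sort after the merge.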

Integration of Talend jobs into Talend ESB routes: When doing ESB, it is essential to be able to transform the emitted data so that it can be consumed by the target system. Talend Enterprise contains components for routes that allow you to integrate jobs into them.


DQ / Anonymization: The Enterprise version provides the Studio's DQ components, including the anonymization and encryption components.
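
To give an idea of what such components do to a record, here is a minimal Python sketch of two common techniques: keyed-hash pseudonymization and masking. It does not reproduce the Talend components; the function names and the key are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace a value with a keyed hash: the same input and key give the same token."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_email(email: str) -> str:
    """Keep the first character and the domain, hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "*" * (len(local) - 1) + "@" + domain


# Hypothetical key for the example; a real key belongs in a secret store.
KEY = b"example-key-not-for-production"
token_a = pseudonymize("alice@example.com", KEY)
token_b = pseudonymize("alice@example.com", KEY)
masked = mask_email("alice@example.com")
```

Because the token is deterministic, pseudonymized datasets can still be joined on the protected column, whereas masking is meant for display and loses that property.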

Data Quality

A complete set of Data Quality modules is available in the Enterprise version. These modules allow you to:

  • Govern datasets and manage user access
  • Document, tag, rate datasets to increase user confidence in a "self-service" approach
  • Set up standardization and quality rules simply
  • Profile datasets
  • Correct data on an industrial or ad hoc basis

All of this can be done centrally by IT or in a decentralized way by the business lines.

A complete example here in English in the field of retail:

Talend Data Inventory: Data Inventory is designed to make it easier to control the quality of your data in the Talend cloud. With simple, automated tools to manage data and metadata, your users, data scientists, and data stewards can discover and improve the data they need in a shared, collaborative workspace.

  • Data Quality rules can be easily set up and used in the Studios or in the self-service modules to verify the data.

Data Quality Rules

  • The business types of the data are automatically identified by the AI embedded in the system and via a dictionary system similar to an MDM
  • A quality score, the Talend Trust Score, is calculated and monitored over time
  • Datasets can be shared with one click via an on-the-fly generated Odata API.

Talend Data Inventory
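
The notion of a reusable quality rule applied to a dataset can be sketched as follows. This is an illustration under my own assumptions: the rules, column names and `valid_ratio` helper are hypothetical, and the ratio computed here is not the formula behind the Talend Trust Score.

```python
import re
from typing import Callable, Dict, List

Rule = Callable[[str], bool]

# Hypothetical rules, keyed by column name.
RULES: Dict[str, Rule] = {
    "email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "country": lambda v: v in {"FR", "DE", "US"},
}


def valid_ratio(rows: List[Dict[str, str]], rules: Dict[str, Rule]) -> Dict[str, float]:
    """Return, per column, the share of rows whose value passes the rule."""
    if not rows:
        return {}
    return {
        column: sum(1 for row in rows if rule(row[column])) / len(rows)
        for column, rule in rules.items()
    }


sample = [
    {"email": "ada@example.com", "country": "FR"},
    {"email": "not-an-email", "country": "FR"},
    {"email": "linus@example.org", "country": "XX"},
    {"email": "grace@example.net", "country": "US"},
]
ratios = valid_ratio(sample, RULES)
```

Tracking such ratios run after run is one simple way to monitor quality over time.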

Talend Data Preparation: This module allows users further away from the IT world to prepare and correct their data in self-service mode and with ease: https://www.talend.com/fr/products/data-preparation/

Talend Data Stewardship: This module allows data stewards to conduct data processing campaigns when "human" functional expertise is required to correct/deduplicate/complete the data.

Anyone can help certify, cleanse, and reconcile data, while delegating the most specialized tasks to subject matter experts when needed.

Talend Data Stewardship

All these modules are complemented by components for Talend jobs, which make it possible to industrialize the cleaning carried out by business users.

Designing, building, and testing REST APIs

A complete API creation module is provided with the Enterprise solution:

Creating the API or Swagger Contract with the Business with the Talend API Designer

  • Documentation
  • Testing on Data Samples

API Designer

Service Creation, Deployment, and Service Observability

  • With the Talend Studio connected to the API Designer
  • On Remote Engines

Testing APIs and chaining nested calls to these APIs with the Talend API Tester.

Designing an API in the Studio

DevOps

Collaboration and versioning solutions such as Git*, Azure DevOps, or AWS CodeCommit integrate natively into the Enterprise version of Talend.

The Talend Management Console allows you to manage users, their roles, and their access to different projects, which are linked to a Git account*.

Project creation in Talend Management Console

Access to projects and their content (jobs, routes, metadata) is therefore secured. After logging in to the Studio, a user can only open the projects they have been granted access to.

In addition, the studio natively integrates push/pull/merge/commit and branch creation actions as well as conflict management during merges.

Built-in versioning management in Talend Studio Enterprise
Conflict management during a merge

This makes it possible to manage several business units (BUs) on the same platform, for example.

One of my clients works in 7 countries with several hundred developers, on a single platform.

CI/CD

By default, the commercial version of Talend offers many APIs that allow you to automate the CI/CD chain.

Let's say you have several hundred Jobs deployed and you want to change your version of Java. Such a chain makes it possible to recompile jobs/routes and redeploy them in a matter of minutes.

Talend Enterprise's CI/CD chain also allows you to integrate test scenarios to confidently redeploy your flows.

CI/CD with Talend

I should also mention Talend Enterprise's ability to generate Docker images for Jobs, Routes, and APIs.

Industrialization

Talend Management Console

The Talend Management Console provides management, monitoring, and control capabilities for your data integration projects created using Talend tools.

Here are some of the key features of the Talend Management Console:

  1. User and access rights management: You can configure users and define their access rights based on roles and responsibilities within the organization.
  2. Integration Job Monitoring: The console allows you to monitor the execution of integration jobs in real time. You can see the progress status, any errors, and other information related to the execution of tasks.
  3. Task scheduling: You can schedule integration jobs to run at specific times, which is useful for automating data integration processes.
  4. Environment Management: TMC makes it easy to manage integration environments, allowing jobs to be deployed and managed on different servers.
  5. Artifact Management: You can manage artifacts, jobs, routes, and web services from the console.
  6. Logging & History: The console records execution logs, which can be used to diagnose any problems and track execution history.
  7. Integration with other tools: TMC can be integrated with other management tools, such as incident tracking tools, database management systems, etc.

In addition, the TMC is fully exposed through APIs and can be controlled remotely (https://api.talend.com).

Talend Management Console

Remote Engines, Hybridization, Scalability and Fail-Over

Remote Engines are the runtimes of the Talend Data Fabric. You can install them wherever you want, within your infrastructure, with the partner or cloud provider of your choice.

They can be clustered to scale out and to fail over if an engine is shut down for any reason.
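
The fail-over behaviour can be pictured with a small Python sketch. It only illustrates the general pattern of trying the next member of a cluster when one is down; it is not how Remote Engine clusters are implemented, and every name in it is hypothetical.

```python
from typing import Callable, List


class EngineDown(Exception):
    """Raised by a hypothetical engine that cannot accept work."""


def run_with_failover(task: str, engines: List[Callable[[str], str]]) -> str:
    """Try each engine in turn and return the first successful result."""
    errors = []
    for engine in engines:
        try:
            return engine(task)
        except EngineDown as exc:
            errors.append(str(exc))  # remember the failure and move to the next engine
    raise RuntimeError(f"all engines failed for {task!r}: {errors}")


def engine_a(task: str) -> str:
    """Simulates an engine that has been shut down."""
    raise EngineDown("engine A is stopped")


def engine_b(task: str) -> str:
    """Simulates a healthy engine."""
    return f"{task} ran on engine B"


result = run_with_failover("nightly_load", [engine_a, engine_b])
```

The task still completes although the first engine is unavailable, and an error is raised only when every engine in the list has failed.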

Remote Engines have mechanisms to send logs to the Talend Management Console or to JMS queues.

Security

I was fortunate enough to meet Talend's SecOps team and see how securing your flows, endpoints, metadata, and connection strings requires specialized expertise.

The Enterprise version of Talend benefits from all the expertise of this team:

  • Common Vulnerabilities and Exposures (CVEs) are constantly introduced by third-party JARs (Apache JARs and others) integrated into the platform. It is essential to carry out active monitoring and testing to detect and patch them. The Enterprise version delivers patches immediately and transparently: see https://security.talend.com/ and scroll down to the Trust Center updates topic.

  • The Enterprise version of Talend offers a connection manager in the Talend Management Console by default, which allows you to store your connection strings securely instead of embedding them in your Jobs.
  • Remote Engines are subject to enhanced security and a secure pairing method.
  • All Talend elements are encrypted "in motion" and "at rest".
  • When deploying in the cloud, Talend can be integrated with your VPC's Private Link.

Certifications & Compliance

  • SSAE 16 SOC2, Type II
  • ISAE 3402
  • Based on NIST 800-53 standards
  • Cloud Security Alliance Level 1
  • OWASP Top 10 development
  • GDPR, CCPA support/compliance

Migration, Partners, Professional Services & Support

The big "plus" of the Enterprise version of Talend is Customer Support together with the pool of certified, French-speaking consultants, whether in-house, freelance or from partners.

For those who would like to migrate from the Talend Open Studio version, Talend offers an accelerated migration procedure and best practices.

Conclusion

An enterprise data project must be built on a sustainable, industrial, secure, supported, interoperable and complete solution.

Developing the critical flows that serve your business decision-makers on the Enterprise version of Talend is a risk-mitigation choice, one that more than 7000 customers around the world have already made.

Feel free to comment and let us know which feature is most important to you.

I hope to see you soon!

Peter Uren

Data Professional at Reapit

10 months

Given that any paid version of Talend is one of the most expensive data integration tools on the market relative to the features offered, I'll certainly be making sure our company switches to a different product and vendor ASAP.
