Dataiku
Dataiku is a platform for building, managing, and deploying data and AI projects. It is used for a variety of applications, including customer segmentation, fraud detection, customer scoring, deep learning, and natural language processing.
Dataiku aims to accelerate the democratization of data. As a machine learning platform, it is easy enough to be used by citizen developers, yet robust and customizable enough to support more advanced work.
It is designed to help data professionals collaborate on data projects. Its main capabilities are described in the sections that follow.
Main features of the Dataiku platform
Integration & Connectivity of Dataiku DSS within other infrastructures
The platform integrates with Hadoop, Spark, SQL, and Teradata, and is available on the AWS, Azure, and Google Cloud marketplaces.
Data schemas and formats are detected automatically: Dataiku can natively recognise a numerical variable, a character string, an age, a date, or even a geographical location.
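To make the idea of automatic detection concrete, here is a minimal Python sketch of type inference over raw text columns. It is not Dataiku's actual detection logic, which the article does not describe; the helper names (`_parses`, `infer_meaning`), the sample columns, and the single accepted date format are all assumptions made for illustration.

```python
from datetime import datetime


def _parses(value, parser):
    """Return True if parser accepts the value without raising ValueError."""
    try:
        parser(value)
        return True
    except ValueError:
        return False


def infer_meaning(values):
    """Guess 'integer', 'decimal', 'date' or 'text' for a column of raw strings.

    Hypothetical helper: a real tool would handle many more formats.
    """
    cells = [v.strip() for v in values if v.strip()]
    if not cells:
        return "text"
    # Candidates are tried from most specific to least specific.
    candidates = [
        ("integer", int),
        ("decimal", float),
        ("date", lambda s: datetime.strptime(s, "%Y-%m-%d")),  # assumed format
    ]
    for name, parser in candidates:
        if all(_parses(cell, parser) for cell in cells):
            return name
    return "text"


# Made-up sample data, one list of raw strings per column.
columns = {
    "age": ["34", "27", "51"],
    "price": ["9.99", "12.5", "3"],
    "signup": ["2021-03-14", "2020-11-02"],
    "city": ["Paris", "Lyon"],
}
inferred = {name: infer_meaning(vals) for name, vals in columns.items()}
print(inferred)
```

The sketch only labels a column when every non-empty cell agrees, which is a deliberately strict rule; a production tool would more likely tolerate a share of invalid cells and report them.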
Moreover, data storage and processing are decoupled: the data stays where it is. It can therefore be accessed and processed in place, without first being transferred to another system.
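The difference between processing data where it lives and moving it first can be sketched in a few lines of Python. This is only an illustration of the general push-down idea, not of how Dataiku implements it: an in-memory SQLite database stands in for a remote data store, and the `orders` table and its rows are invented for the example.

```python
import sqlite3

# An in-memory SQLite database stands in for a remote SQL data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 20.0), ("bob", 5.0), ("alice", 30.0), ("bob", 15.0)],
)

# Push-down: the database computes the aggregate, and only the
# summary rows (one per customer) travel back to the client.
pushed_down = dict(
    conn.execute(
        "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
    ).fetchall()
)

# For comparison: pull every row out and aggregate on the client side.
pulled = {}
for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
    pulled[customer] = pulled.get(customer, 0.0) + amount

print(pushed_down)
conn.close()
```

Both approaches give the same totals; the first moves two summary rows, the second moves the whole table. On a large table that gap is what makes in-place processing attractive.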
Plugins
Dataiku DSS comes with standard visual components to connect to data, process it, and train models. Dataiku also offers the flexibility to implement custom components, package them, and share them with others. These custom components are available as plugins. Each plugin consists of a graphical user interface and a backend programmed by the developer in R or Python.
There is a gallery of more than 100 plugins in the Dataiku Plugin Store, providing data applications in many areas such as language translation, weather, recommendation systems, data import/export and ready-to-use graphical interfaces.
Optimised data preparation
The graphical interface of Dataiku DSS accelerates data wrangling with interactive data cleansing and enrichment. Dataiku automatically suggests contextual transformations according to the type of data. For example, from a date, Dataiku suggests calculating an age. From an address, it can extract the street number and name, the postal code, or the city. More than 80 visual processors can be activated with a few clicks and without code. The same interface also lets users filter, transform, or statistically summarise the data with a few clicks.
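The two transformations mentioned above, deriving an age from a date and splitting an address into parts, could look roughly like this if written by hand in Python. This is a sketch under stated assumptions, not the code behind Dataiku's visual processors: the function names `age_on` and `split_address` are hypothetical, and the address pattern assumes a simple "number street, five-digit postal code city" layout.

```python
import re
from datetime import date


def age_on(birth, today):
    """Return the number of whole years between birth and today."""
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)


# Assumed layout: "<number> <street>, <5-digit postal code> <city>".
ADDRESS = re.compile(
    r"^(?P<number>\d+)\s+(?P<street>[^,]+),\s*"
    r"(?P<postcode>\d{5})\s+(?P<city>.+)$"
)


def split_address(text):
    """Split an address into its parts, or return None if it does not match."""
    match = ADDRESS.match(text.strip())
    return match.groupdict() if match else None


# Made-up sample values.
age = age_on(date(1990, 6, 15), date(2024, 6, 14))
parts = split_address("12 rue de la Paix, 75002 Paris")
print(age, parts)
```

Real addresses vary far more than this pattern allows, which is one reason ready-made processors are convenient: the hand-written version has to be extended for every new format.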
Integrated development
Many languages are supported by Dataiku DSS: Python, R, Scala, PySpark, SparkR and SparkSQL, SQL, Hive, Pig and Impala. Dataiku is therefore aimed at all types of users whatever their technical background and at all levels of expertise.
Machine learning & AI
The platform includes a complete graphical interface (called Datalab) dedicated to the development of machine learning models. This interface allows the configuration of models, the visualisation of model performance and a simplified reading of the results produced by the algorithms.
Collaboration & Governance
Dataiku DSS incorporates features to optimise sharing and exchange within data teams and business teams. These include project management, chat, wiki, and versioning tools.
For data governance, the platform provides a centralised catalogue of data, comments, elements and models. In addition, all user activities are shown on a dedicated dashboard, and security is supported by further features such as permissions management, log management, and monitoring of data size and instance activity. Together, these features are intended to cover data governance and auditing requirements.