HyperMorph

Why HyperMorph

Not just another ETL tool

There are many open-source tools written in Python for ETL or ELT processes: petl, bubbles, mara-pipelines, mara-schema, bonobo, luigi, odo, etlalchemy, mETL, riko, carry, locopy, etlpy, pygrametl. The authors of many of these tools realized that Python developers need a uniform interface, based on object-oriented abstractions, for commonly used operations.

Well-designed powerful OOP classes

HyperMorph offers interactive console programming and development with high-level OOP components tailored to cover all aspects of database management and analytics. It is very rich in that respect and provides DataSet, Table and Field classes for data resources; DataModel, Entity and Attribute classes for data models; SchemaGraph, SchemaNode, SchemaLink and SchemaPipe classes for metadata management; DataGraph, DataNode, DataLink and DataPipe classes for data management; and a Connector class for Python drivers/clients. At the highest level of management we have ASET (Associative Entity Set, similar to a relation) and HACOL (HyperAtom collection).

Schema and data as objects and nodes on a hypergraph

HyperMorph goes one step beyond the OOP design principle. It creates objects with 3D numerical vector identities and links them as nodes on a hypergraph. That graph is powered by graph-tool, one of the best and fastest network analysis tools in Python. HyperMorph keeps schema information, i.e. metadata, separate from structured data (tuples, hierarchical, graph, table, etc.). This unique feature makes it easy to organize data resources and to build complex customised data models in order to digest data. Data integration (consolidation) requires managing the complexity of mapping data resources onto a data model, something that is easily done when our objects are hypergraph-enabled and have numerical key vectors that identify their exact location in the schema graph and the data graph.
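As a rough illustration of the idea above, the sketch below models schema objects as nodes with three-component numerical keys and lets an entity act as a hyperedge that connects any number of attribute nodes. All class and method names, and the meaning given to each key component, are assumptions made for this example. HyperMorph itself builds its graph with graph-tool, which is not used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    key: tuple   # 3D numerical identity; (model, entity, attribute) is an assumed layout
    name: str
    kind: str    # "entity" acts as a hyperedge, "attribute" as a member node


class SchemaHypergraph:
    """Hypothetical sketch, not HyperMorph's SchemaGraph class."""

    def __init__(self):
        self.nodes = {}   # key -> Node
        self.edges = {}   # entity key -> set of attribute keys

    def add(self, key, name, kind):
        self.nodes[key] = Node(key, name, kind)
        if kind == "entity":
            self.edges[key] = set()
        return key

    def link(self, entity_key, attribute_key):
        self.edges[entity_key].add(attribute_key)

    def attributes_of(self, entity_key):
        return sorted(self.nodes[k].name for k in self.edges[entity_key])

    def entities_of(self, attribute_key):
        return sorted(
            self.nodes[e].name
            for e, members in self.edges.items()
            if attribute_key in members
        )


g = SchemaHypergraph()
supplier = g.add((1, 1, 0), "Supplier", "entity")
part = g.add((1, 2, 0), "Part", "entity")
sname = g.add((1, 0, 1), "sname", "attribute")
pname = g.add((1, 0, 2), "pname", "attribute")
city = g.add((1, 0, 3), "city", "attribute")

g.link(supplier, sname)
g.link(supplier, city)
g.link(part, pname)
g.link(part, city)   # one attribute node shared by two entities

print(g.attributes_of(supplier))
print(g.entities_of(city))
```

In this toy model the shared city node is what ties the two entities together, and its key alone is enough to locate it in the graph.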

HyperMorph Connectors

Another fundamental difference between HyperMorph and ETL tools is on the Python DB driver/adapter side. The current release supports:

  1. Clickhouse-Driver
  2. MySQL-Connector
  3. SQLAlchemy with the following three dialects:
       • pymysql
       • clickhouse
       • sqlite

On top of these drivers HyperMorph uses a Connector class to abstract and unify SQL command execution, sql(), in a functional way, and to wrap the commands that extract metadata, get_tables_metadata() and get_columns_metadata(). Transformation to tuples, JSON rows, columns, and PyArrow batches/tables takes place at this level. At this stage performance is a critical factor. In our design and implementation of HyperMorph connectors we seek to minimise latency and maximise data transfer speed. The communication protocol used by the Python database driver/adapter is therefore highly important.
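To make the connector idea concrete, here is a minimal sketch built on the standard-library sqlite3 driver. Only the method names sql(), get_tables_metadata() and get_columns_metadata() come from the text above. The class name, the signatures, the `out` parameter and its values are assumptions for illustration and may not match HyperMorph's real Connector.

```python
import sqlite3


class SQLiteConnector:
    """Hypothetical connector sketch; not HyperMorph's actual Connector class."""

    def __init__(self, database=":memory:"):
        self._conn = sqlite3.connect(database)

    def sql(self, query, params=(), out="tuples"):
        """Run one statement and return the result in the requested layout."""
        cursor = self._conn.execute(query, params)
        if cursor.description is None:   # statement produced no result set
            self._conn.commit()
            return None
        names = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
        if out == "tuples":
            return rows
        if out == "json_rows":
            return [dict(zip(names, row)) for row in rows]
        if out == "columns":
            return {name: [row[i] for row in rows] for i, name in enumerate(names)}
        raise ValueError(f"unknown output layout: {out}")

    def get_tables_metadata(self):
        return self.sql(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name",
            out="json_rows",
        )

    def get_columns_metadata(self, table):
        # The table name is interpolated into the statement, so validate it first.
        known = {t["name"] for t in self.get_tables_metadata()}
        if table not in known:
            raise ValueError(f"unknown table: {table}")
        return self.sql(f'PRAGMA table_info("{table}")', out="json_rows")


db = SQLiteConnector()
db.sql("CREATE TABLE parts (pid INTEGER PRIMARY KEY, pname TEXT, color TEXT)")
db.sql("INSERT INTO parts (pname, color) VALUES (?, ?)", ("Nut", "Red"))
db.sql("INSERT INTO parts (pname, color) VALUES (?, ?)", ("Bolt", "Green"))

columns = db.sql("SELECT pname, color FROM parts ORDER BY pid", out="columns")
tables = db.get_tables_metadata()
column_names = [c["name"] for c in db.get_columns_metadata("parts")]
print(columns, tables, column_names)
```

The same sql() call serves both data and metadata requests, so the code that consumes the results does not need to know which driver sits underneath.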

Pipelines

Pipelines are a standard approach in ETL frameworks, and a very useful one, because in general they are flexible and intuitive to program. HyperMorph is no exception, but we tried to make a difference here by designing the same pipeline operators for fetching either data or metadata. For example, there is an over() operator for projection and to_dataframe() for transformation to a pandas DataFrame. We have even wrapped pipelines in functional commands, so that you can choose between an OOP (chaining) and a functional style of programming.
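The two styles can be sketched in a few lines. In the example below only the operator name over() is taken from the text. The Pipe class, to_tuples() and the functional wrappers are hypothetical stand-ins, and a to_dataframe() step is left out so that the sketch runs without pandas.

```python
class Pipe:
    """Hypothetical pipeline sketch; HyperMorph's own pipe classes may differ."""

    def __init__(self, rows):
        self._rows = list(rows)   # list of dicts, data rows or metadata rows alike

    def over(self, *fields):
        """Projection: keep only the named fields, in the given order."""
        return Pipe({f: row[f] for f in fields} for row in self._rows)

    def to_tuples(self):
        return [tuple(row.values()) for row in self._rows]


# Functional wrappers around the same operators
def over(pipe, *fields):
    return pipe.over(*fields)


def to_tuples(pipe):
    return pipe.to_tuples()


data = [
    {"pid": 1, "pname": "Nut", "color": "Red"},
    {"pid": 2, "pname": "Bolt", "color": "Green"},
]
metadata = [
    {"name": "pid", "type": "INTEGER", "pk": 1},
    {"name": "pname", "type": "TEXT", "pk": 0},
]

chained = Pipe(data).over("pname", "color").to_tuples()        # OOP chaining
functional = to_tuples(over(Pipe(data), "pname", "color"))     # functional style
meta = Pipe(metadata).over("name", "type").to_tuples()         # same operator on metadata
print(chained, functional, meta)
```

Both styles call the same underlying operators, so they return identical results, and the projection works unchanged on metadata rows.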

Not only a data storage and transformations-analytics tool

There is another category of tools related to data storage (in-memory, on-disk), transformations and analytics processing, such as TileDB, datatable, pandas, petl, vaex, pytables, ibis, numpy, dask, pyarrow, gandiva. Most of them construct a table data structure in memory or on disk and use either a column layout or a row layout to process the data. Hence they resemble database engines. In fact, previous prototypes of HyperMorph (see the TRIADB project) were based on SQL database engines. The current, first, release of HyperMorph is powered by PyArrow. There are many reasons for that choice. Most importantly, PyArrow is mature and provides a columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware including GPUs. But for HyperMorph the killer feature of the PyArrow package is dictionary encoding, which is used to implement associative filtering, part of our associative semiotic hypergraph technology, in the style of the Qlik analytics engine.
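The link between dictionary encoding and associative filtering can be shown with a small sketch. PyArrow offers the encoding natively through Array.dictionary_encode(); the version below re-creates it in pure Python so that it runs without dependencies. The function names, the sample data and the "associated"/"excluded" labels are assumptions for illustration, not HyperMorph's implementation.

```python
def dictionary_encode(values):
    """Return (dictionary, indices): unique values in first-seen order plus per-row codes."""
    dictionary, codes, indices = [], {}, []
    for value in values:
        if value not in codes:
            codes[value] = len(dictionary)
            dictionary.append(value)
        indices.append(codes[value])
    return dictionary, indices


def associate(table, field, selected):
    """Select values in one field and classify every value of the other fields."""
    dictionary, indices = table[field]
    wanted = {code for code, value in enumerate(dictionary) if value in selected}
    rows = [i for i, code in enumerate(indices) if code in wanted]

    states = {}
    for name, (other_dict, other_idx) in table.items():
        if name == field:
            continue
        possible = {other_idx[i] for i in rows}   # codes that occur in the selected rows
        states[name] = {
            "associated": [v for c, v in enumerate(other_dict) if c in possible],
            "excluded": [v for c, v in enumerate(other_dict) if c not in possible],
        }
    return states


raw = {
    "supplier": ["Smith", "Smith", "Jones", "Blake", "Jones"],
    "part": ["Nut", "Bolt", "Nut", "Screw", "Cam"],
    "city": ["London", "London", "Paris", "Paris", "Paris"],
}
table = {name: dictionary_encode(column) for name, column in raw.items()}

states = associate(table, "city", {"Paris"})
print(states)
```

Because each column is reduced to small integer codes, the filter only compares integers and looks up the original values once, when the result is reported.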

More promising than data virtualization and cloud analytics services

In recent years another approach to data management and analytics has emerged, aiming to skip the wearisome ETL process. Usually these are SaaS products in the cloud, such as panoply, dremio, knowi, denodo, and many others. They provide GUIs and act as middleware between DBMS and BI platforms. Naturally these are proprietary products, and the details of how they work under the hood are hidden. Developers and power users have to stick with menu- and widget-driven interfaces rather than having the ultimate flexibility of programming at the level of the Python language. You may consider HyperMorph an open-source API with the same role: fetching data for graph visualisation platforms. HyperMorph has three key differentiating points here: data consolidation, user-defined data modeling, and interactive associative filtering for analytics, with the option to visualize connected data on a graph. And because HyperMorph is open-source, it is more promising: our technology can potentially be used by many software vendors for BI applications.

Speechless HyperMorph Screencast

HyperMorph speaks for itself

Watch the demo. Check the YouTube settings and make sure the video quality is set to 1080p HD. You may also set the playback speed to 0.75 to slow down the execution of commands.

Now you know that you can …

and the only limit on what you can do is your imagination.

Installation - Demo Test - Documentation

Step by step instructions

on how to install the release.

Demo Guide to Test package

Demonstration of HyperMorph functionality on data resources and demo scripts that are included in the distribution.

Documentation

A draft of the documentation, generated automatically with Sphinx from comments in the source code, is hosted on GitHub.
