Multi-instance Syncing and the Power of TM1py

Shane Bethea

Solution Architect | Cubewise North America

Published: October 14, 2022

Have you ever found yourself needing to quickly synchronize a set of dimensions or cube data between your Dev and Prod Planning Analytics instances? Do you have PA instances in multiple regions that you wish you could easily keep in sync? Or perhaps you’re considering a shift from on-premise PA to a cloud-hosted environment – how would you keep dimensions and data in sync during the transition period?

I recently worked on a project with several Cubewise colleagues, where we migrated multiple on-premise PA instances to the IBM Planning Analytics cloud-hosted environment. Due to the size of the application, the client transitioned to PA cloud in a phased approach. Because of this, we developed a solution to keep the on-premise and PA cloud instances in sync so that everyone would see the same data.

You’ve probably guessed by now that we used TM1py and the TM1 REST API to accomplish the metadata and data synchronizations. That’s right – no pesky CSV files to manage here! TM1py synchronizations have been done in the past, but we took this solution to a whole new level and developed a Planning Analytics model called Python Control Center (PCC). PCC centrally organizes, manages, and methodically orchestrates every aspect of metadata and data synchronization across multiple PA instances. Genius!

The best part is, while this solution uses some very sophisticated and robust TM1py and Python techniques, you don’t need to be a Python expert to manage it. If you know Planning Analytics, you know Python Control Center. PCC is a PA model made up of a series of cubes and TI processes…that just so happens to have a series of Python scripts that do the backend work. This backend heavy lifting was done by Cubewise to make the model portable, so that it can be used in other PA environments.

PCC Modules

PCC is made up of 3 main modules:

Python Sync Metadata (PSM)

The PSM module is designed to control Python metadata synchronizations between multiple PA instances. It can check the synchronization status of dimensions and then perform a TM1py sync of dimensions that are misaligned.
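To make the idea concrete, here is a minimal sketch of a check-then-sync pass of the kind PSM performs. It is not the PCC code: the function names are hypothetical, `source` and `target` are assumed to be connected TM1py `TM1Service` objects, each dimension is assumed to have a single hierarchy named after the dimension, and only element names are compared (not hierarchy structure or attributes).

```python
def dimension_diff(source_elements, target_elements):
    """Compare the element names of one dimension on two instances.

    Note: TM1 treats names as case- and space-insensitive; this sketch
    compares them literally.
    """
    source_set, target_set = set(source_elements), set(target_elements)
    return {
        "missing_in_target": sorted(source_set - target_set),
        "extra_in_target": sorted(target_set - source_set),
    }


def find_misaligned(source, target, dimension_names):
    """Return {dimension: diff} for dimensions whose element names differ.

    Assumption: `source` and `target` behave like TM1py TM1Service objects,
    and the hierarchy name equals the dimension name.
    """
    misaligned = {}
    for name in dimension_names:
        diff = dimension_diff(
            source.elements.get_element_names(name, name),
            target.elements.get_element_names(name, name),
        )
        if diff["missing_in_target"] or diff["extra_in_target"]:
            misaligned[name] = diff
    return misaligned


def sync_dimensions(source, target, dimension_names):
    """Copy each misaligned dimension from source to target."""
    synced = []
    for name in find_misaligned(source, target, dimension_names):
        # Read the full dimension from the source and create or overwrite it
        # on the target.
        target.dimensions.update_or_create(source.dimensions.get(name))
        synced.append(name)
    return synced
```

Separating the status check from the sync mirrors the description above: the check can run on its own to report which dimensions are out of line before anything is changed on the target.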

Python Sync Data (PSD)

The PSD module is designed to control Python data synchronizations between multiple PA instances. It can perform a TM1py sync of cube data, from source to target instance, based on a stored MDX view definition.
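A data sync driven by an MDX definition could look roughly like the sketch below. Again, this is an illustration rather than the PCC implementation: `transfer_cube_data` and `element_name` are hypothetical helpers, `source` and `target` are assumed to be TM1py `TM1Service` objects, and the unique-name parsing is deliberately simplified (it assumes names of the form `[Dim].[Hier].[Elem]`). Exact TM1py signatures can vary by version, so treat the calls as assumptions to verify against your installed release.

```python
def element_name(unique_name):
    """Extract 'Elem' from '[Dim].[Hier].[Elem]' (simplified parsing)."""
    start = unique_name.rfind("].[") + 3
    # Drop the closing bracket and undo MDX escaping of ']' inside names.
    return unique_name[start:-1].replace("]]", "]")


def transfer_cube_data(source, target, cube_name, mdx):
    """Read the cells selected by `mdx` on source and write them to target.

    Assumption: execute_mdx returns a mapping of
    (element unique names, ...) -> {"Value": ...}, with coordinates in
    cube dimension order.
    """
    cellset = source.cells.execute_mdx(mdx)
    cells = {
        tuple(element_name(u) for u in coordinates): cell["Value"]
        for coordinates, cell in cellset.items()
        if cell["Value"] is not None  # skip empty cells
    }
    if cells:
        target.cells.write(cube_name, cells)
    return len(cells)
```

Because the MDX is just a string, it can be stored alongside the other sync settings and changed without touching the Python code.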

Python Data Control (PDC)

The PDC module is designed to control Python execution of TI processes and chores on a specified PA instance. It can execute one or more chores or TI processes, passing parameters as needed.
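As a hedged sketch of what remote execution might look like from Python, the function below runs a list of TI processes and chores on one instance and collects the outcome of each. The task dictionary layout and the `run_tasks` name are hypothetical; `tm1` is assumed to be a TM1py `TM1Service` object, and a chore is treated as successful if the call returns without raising.

```python
def run_tasks(tm1, tasks):
    """Execute processes and chores in order and return one result per task.

    Each task is a dict such as:
        {"type": "process", "name": "...", "parameters": {"pYear": "2022"}}
        {"type": "chore", "name": "..."}
    """
    results = []
    for task in tasks:
        name = task["name"]
        if task["type"] == "process":
            # Process parameters are passed through as keyword arguments.
            success, status, error_log = tm1.processes.execute_with_return(
                process_name=name, **task.get("parameters", {})
            )
        elif task["type"] == "chore":
            tm1.chores.execute_chore(name)
            # Assumption: no exception means the chore ran.
            success, status, error_log = True, None, None
        else:
            raise ValueError(f"Unknown task type: {task['type']!r}")
        results.append(
            {"name": name, "success": success, "status": status, "error_log": error_log}
        )
    return results
```

Returning a plain list of results keeps the function easy to log from, for example by writing each entry back to a history cube.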

How PCC Works

All 3 modules work in the same way:

Settings and runtime information are stored in a settings cube by index. You can store as many metadata sync, data sync, or TI/chore execution definitions in the settings cubes as you need.
TI processes loop through the active indices in the settings cube, gather the information needed for each synchronization, and pass that information to the Python scripts.
The Python scripts receive a series of parameters from TI, including connection, logging, and transfer details, and then perform the synchronizations as necessary.
Finally, execution details, such as success/failure messages and statistics about the synchronization or TI/chore executions, are logged to history cubes.

[Figure: Example data synchronization in the PSD Settings cube]
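The hand-off from TI to Python described above can be pictured as a script that accepts its settings as command-line arguments. The sketch below shows one way to do that with the standard library. Every argument name here is hypothetical, and the article does not say which mechanism PCC uses to launch the scripts (the TI function ExecuteCommand is one possibility).

```python
import argparse
import logging


def parse_args(argv=None):
    """Parse the settings that a TI process would pass for one sync index."""
    parser = argparse.ArgumentParser(description="Hypothetical data sync script")
    parser.add_argument("--index", type=int, required=True,
                        help="settings cube index of this sync definition")
    parser.add_argument("--source", required=True, help="source instance name")
    parser.add_argument("--target", required=True, help="target instance name")
    parser.add_argument("--cube", required=True, help="cube to synchronize")
    parser.add_argument("--mdx", required=True, help="stored MDX view definition")
    parser.add_argument("--log-file", default=None, help="optional log file path")
    return parser.parse_args(argv)


def run(argv=None):
    """Parse arguments, set up logging, and report what would be synced."""
    args = parse_args(argv)
    logger = logging.getLogger("pcc_sync_sketch")
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        handler = (logging.FileHandler(args.log_file)
                   if args.log_file else logging.StreamHandler())
        logger.addHandler(handler)
    logger.info("Index %d: syncing cube %s from %s to %s",
                args.index, args.cube, args.source, args.target)
    # The actual transfer would happen here.
    return args
```

Keeping every setting in the argument list means the script holds no configuration of its own, which fits the idea that the model is managed entirely from the settings cubes.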

Multi-threading for Better Performance

Each module uses multi-threading to improve performance in a couple of different ways:

The model can run multiple executions in parallel, using the TI function RunProcess and a special Thread Tracker cube to control the number of parallel threads running at one time.
The Python scripts can transfer data asynchronously by spawning multiple threads and splitting large data sets into smaller read/write chunks.
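The chunk-and-thread idea on the Python side can be sketched with the standard library's thread pool. This is an illustration under assumptions, not the PCC code: `write_chunk` stands in for whatever function writes one batch of cells to the target instance, and the chunk size and worker count are arbitrary defaults.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` entries each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def write_in_parallel(cells, write_chunk, chunk_size=1000, max_workers=4):
    """Split `cells` into chunks and write them from a pool of threads.

    `cells` maps coordinates to values; `write_chunk` is a caller-supplied
    function that writes one dict of cells. Returns the number of chunks.
    """
    items = list(cells.items())
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(write_chunk, dict(chunk))
            for chunk in chunked(items, chunk_size)
        ]
        for future in futures:
            # result() re-raises any exception from the worker thread.
            future.result()
    return len(futures)
```

Capping `max_workers` plays the same role on the Python side that the Thread Tracker cube plays in the model: it limits how many transfers hit the server at once.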

If you would like to synchronize dimensions and data across multiple Planning Analytics instances, the Python Control Center solution is right for you. The PCC model is super portable and can be implemented on-premise or in any cloud-hosted environment. It can perform synchronization tasks across any on-premise or cloud-hosted PA environment exposed via the REST API port. If this sounds like an interesting solution, feel free to reach out to your local Cubewise office for more information.

Cubewise Contact Page:

https://cubewise.com/reachout/
