
June 02, 2023

Kannan Subbiah

FCA | CISA | CGEIT | CCISO | GRC Consulting | Independent Director | Enterprise & Solution Architecture | Former Sr. VP & CTO of MF Utilities | BU Soft Tech | itTrident

Published: June 2, 2023

A Data Scientist’s Essential Guide to Exploratory Data Analysis

Analyzing the individual characteristics of each feature is crucial as it will help us decide on their relevance for the analysis and the type of data preparation they may require to achieve optimal results. For instance, we may find values that are extremely out of range and may refer to inconsistencies or outliers. We may need to standardize numerical data or perform a one-hot encoding of categorical features, depending on the number of existing categories. Or we may have to perform additional data preparation to handle numeric features that are shifted or skewed, if the machine learning algorithm we intend to use expects a particular distribution. ... For Multivariate Analysis, best practices focus mainly on two strategies: analyzing the interactions between features, and analyzing their correlations. ... Interactions let us visually explore how each pair of features behaves, i.e., how the values of one feature relate to the values of the other.
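The univariate and multivariate checks described above can be sketched with the Python standard library alone. The toy values, variable names and helper functions below are hypothetical, the 2-standard-deviation cutoff is an arbitrary choice, and a real project would more likely use a dataframe library; treat this as a minimal illustration, not the article's own method.

```python
import math
from statistics import mean, pstdev

# Hypothetical toy dataset: names and values are illustrative only.
ages = [22, 25, 31, 35, 41, 48, 95]      # 95 is a deliberately extreme value
incomes = [21, 27, 30, 38, 45, 52, 60]   # in thousands
colors = ["red", "blue", "red", "green", "blue", "red", "green"]


def standardize(values):
    """Z-score each value: (x - mean) / population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(x - mu) / sigma for x in values]


def skewness(values):
    """Population skewness: mean of cubed z-scores (positive = long right tail)."""
    return mean([z ** 3 for z in standardize(values)])


def one_hot(labels):
    """Map each label to a 0/1 vector over the sorted set of categories."""
    categories = sorted(set(labels))
    return categories, [[int(label == c) for c in categories] for label in labels]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Flag values more than 2 standard deviations from the mean as candidate outliers.
outliers = [x for x, z in zip(ages, standardize(ages)) if abs(z) > 2]
categories, encoded = one_hot(colors)
age_income_corr = pearson(ages, incomes)
```

Here `outliers` holds the values worth a closer look, `skewness(ages)` indicates whether a transform may be needed before modelling, and `age_income_corr` is the kind of pairwise figure a correlation matrix would collect for every pair of features.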

Resilient data backup and recovery is critical to enterprise success

So, what must IT leaders consider? The first step is to establish data protection policies that include encryption and least privilege access permissions. Businesses should then ensure they have three copies of their data – the production copy already exists and is effectively the first copy. The second copy should be stored on a different media type, not necessarily in a different physical location (the logic behind it is to not store your production and backup data in the same storage device). The third copy could or should be an offsite copy that is also offline, air-gapped, or immutable (Amazon S3 with Object Lock is one example). Organizations also need to make sure they have a centralized view of data protection across all environments for greater management, monitoring and governance, and they need orchestration tools to help automate data recovery. Finally, organizations should conduct frequent backup and recovery testing to make sure that everything works as it should.
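The three-copy rule described above can be written down as a small policy check. The `BackupCopy` fields and the `check_policy` helper are hypothetical names introduced for illustration; the sketch only encodes the rule as the paragraph states it and does not talk to any real storage system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of a dataset (hypothetical model, not a real backup API)."""
    name: str
    media_type: str          # e.g. "san", "disk", "tape", "object-storage"
    offsite: bool = False
    protected: bool = False  # offline, air-gapped, or immutable


def check_policy(copies):
    """Return a list of rule violations; an empty list means the policy holds."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than three copies of the data")
    if len({c.media_type for c in copies}) < 2:
        problems.append("all copies share one media type")
    if not any(c.offsite and c.protected for c in copies):
        problems.append("no offsite copy that is offline, air-gapped, or immutable")
    return problems


# Hypothetical inventory: production, a local backup, and an immutable offsite copy.
copies = [
    BackupCopy("production", media_type="san"),
    BackupCopy("local-backup", media_type="disk"),
    BackupCopy("cloud-archive", media_type="object-storage", offsite=True, protected=True),
]
violations = check_policy(copies)
```

A check like this could run alongside the regular recovery tests so that a drifted configuration is noticed before a restore is needed.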

Data Warehouse Architecture Types

Different architectural approaches offer unique advantages and cater to varying business requirements. In this comprehensive guide, we will explore different data warehouse architecture types, shedding light on their characteristics, benefits, and considerations. Whether you are building a new data warehouse or evaluating your existing architecture, understanding these options will empower you to make informed decisions that align with your organization’s goals. ... Selecting the right data warehouse architecture is a critical decision that directly impacts an organization’s ability to leverage its data assets effectively. Each architecture type has its own strengths and considerations, and there is no one-size-fits-all solution. By understanding the characteristics, benefits, and challenges of different data warehouse architecture types, businesses can align their architecture with their unique requirements and strategic goals. Whether it’s a traditional data warehouse, hub-and-spoke model, federated approach, data lake architecture, or a hybrid solution, the key is to choose an architecture that empowers data-driven insights, scalability, agility, and flexibility.

What is federated Identity? How it works and its importance to enterprise security

FIM has many benefits, including reducing the number of passwords a user needs to remember, improving the user experience and strengthening security infrastructure. On the downside, federated identity does introduce complexity into application architecture. This complexity can also introduce new attack surfaces, but on balance, properly implemented federated identity is a net improvement to application security. In general, we can see federated identity as improving convenience and security at the cost of complexity. ... Federated single sign-on allows for sharing credentials across enterprise boundaries. As such, it usually relies on a large, well-established entity with widespread security credibility, such as Google, Microsoft, or Amazon. In this case, applications usually gain not just a simplified login experience for their users, but both the impression of, and actual reliance on, high-level security infrastructure. Put another way, even a small application can add “Sign in with Google” to its login flow relatively easily, giving users a simple login option that keeps sensitive information in the hands of the big organization.
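As a rough illustration of why the password stays with the identity provider, the first step of a "Sign in with Google" flow is just a redirect to the provider's authorization endpoint. The sketch below assumes an OpenID Connect authorization-code flow; the client ID, redirect URI and helper names are hypothetical placeholders, and a real integration also needs the server-side code exchange and ID-token validation, which are omitted here.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Google's documented OAuth 2.0 authorization endpoint.
AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"


def build_login_url(client_id, redirect_uri):
    """Return (url, state) for the first step of an authorization-code flow."""
    state = secrets.token_urlsafe(16)  # random value tying the callback to this request
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": "openid email",
        "state": state,
    }
    return f"{AUTH_ENDPOINT}?{urlencode(params)}", state


def callback_is_valid(returned_state, expected_state):
    """Reject callbacks whose state does not match the one we issued."""
    return secrets.compare_digest(returned_state, expected_state)


# Placeholder client ID and redirect URI, for illustration only.
url, state = build_login_url("example-client-id", "https://app.example.com/callback")
query = parse_qs(urlparse(url).query)  # what the provider will receive
```

The application never sees the user's credentials: the user authenticates at the provider, and the application only receives a short-lived code on its callback URL, which it accepts only if `callback_is_valid` confirms the `state` value.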

Millions of PC Motherboards Were Sold With a Firmware Backdoor

Given the millions of potentially affected devices, Eclypsium’s discovery is “troubling,” says Rich Smith, who is the chief security officer of supply-chain-focused cybersecurity startup Crash Override. Smith has published research on firmware vulnerabilities and reviewed Eclypsium’s findings. He compares the situation to the Sony rootkit scandal of the mid-2000s. Sony had hidden digital-rights-management code on CDs that invisibly installed itself on users’ computers and in doing so created a vulnerability that hackers used to hide their malware. “You can use techniques that have traditionally been used by malicious actors, but that wasn’t acceptable, it crossed the line,” Smith says. “I can’t speak to why Gigabyte chose this method to deliver their software. But for me, this feels like it crosses a similar line in the firmware space.” Smith acknowledges that Gigabyte probably had no malicious or deceptive intent in its hidden firmware tool. But by leaving security vulnerabilities in the invisible code that lies beneath the operating system of so many computers, it nonetheless erodes a fundamental layer of trust users have in their machines.

Minimising the Impact of Machine Learning on our Climate

There are several things we can do to mitigate the negative impact of software on our climate. They will be different depending on your specific scenario. But what they all have in common is that they should strive to be energy-efficient, hardware-efficient and carbon-aware. GSF is gathering patterns for different types of software systems; these have all been reviewed by experts and agreed on by all member organisations before being published. In this section we will cover some of the patterns for machine learning as well as some good practices which are not (yet?) patterns. If we group the actions by stage of the ML life cycle, or at least a simplified version of it, we get four categories: Project Planning, Data Collection, Design and Training of the ML Model and, finally, Deployment and Maintenance. The project planning phase is the time to start asking the difficult questions, think about what the carbon impact of your project will be and how you plan to measure it. This is also the time to think about your SLA; overcommitting to strict latency or performance metrics that you don’t actually need can quickly become a source of avoidable emissions.
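One rough way to start measuring during project planning, assuming you can estimate a training job's energy use, is to multiply that energy by the carbon intensity of the electricity grid. The function name, parameters and all numbers below are hypothetical placeholders; this is a back-of-envelope sketch of operational emissions only, not a published GSF pattern.

```python
def training_emissions_kg(gpu_count, avg_power_watts, hours, grid_g_per_kwh, pue=1.0):
    """Estimate operational emissions in kg CO2e for one training run.

    energy (kWh) = GPUs x average power (W) x hours / 1000, scaled by data-centre PUE
    emissions    = energy x grid carbon intensity (g CO2e per kWh), converted to kg
    """
    energy_kwh = gpu_count * avg_power_watts * hours / 1000 * pue
    return energy_kwh * grid_g_per_kwh / 1000


# Hypothetical run: 8 GPUs at 300 W for 10 hours, compared across two grids.
low_carbon = training_emissions_kg(8, 300, 10, grid_g_per_kwh=50)
high_carbon = training_emissions_kg(8, 300, 10, grid_g_per_kwh=500)
```

With these made-up inputs the same job differs tenfold in estimated emissions depending on the grid it runs on, which is the intuition behind being carbon-aware about where and when training is scheduled.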

Read more here ...

Today's Tech Digest
