To Couple or Decouple
Story
The Issue
I was recently asked to help troubleshoot a mission-critical application experiencing
performance issues.
I pulled a list of the most expensive queries.
The customer mistakenly thought TOP 1 would always be the fastest because it returned only
1 data row.
I explained the query to the customer.
The explanation showed the high expense of joining tables, grouping, and aggregates.
I then looked to see if they had indexes on the tables they accessed.
The primary table had 25 indexes to facilitate the expensive query.
I asked if the primary table was updated frequently.
The customer confirmed it was updated frequently.
I explained to the customer that when this table is updated, it locks the indexes so SQL can
update them. While this takes place, the data is not accessible to the other requests.
This is a textbook example of a concurrency issue.
Table sharing between two or more applications was never a good idea.
If you want your production system to perform like a well-oiled machine, it must not be
disturbed by outside influences like external requests for data.
Of course, we know the requirements to share data across systems will always be there.
I decided to evaluate the pros and cons of Data Table Sharing.
Analysis
Pros:
Data sharing and integration:
Multiple applications can access and manipulate the same data,
Facilitating data exchange and interoperability among different systems or domains.
Data consistency and integrity:
Multiple applications can rely on the same set of constraints and validations enforced
by the database, which can ensure data quality and prevent data anomalies or
conflicts.
Data management and maintenance:
Multiple applications can benefit from the same database administration and
maintenance tasks, including backup, recovery, tuning, migration, etc.
Cons:
Data coupling and dependency:
Multiple applications can become tightly coupled and
dependent on the same database schema, limiting the flexibility and scalability of
each
application and introducing risks of breaking changes or compatibility issues.
Data contention and concurrency:
Multiple applications can compete for the same database
Resources, such as locks, connections, transactions, etc., which can degrade the
performance and
availability of each application and cause potential deadlocks or conflicts.
Data security and isolation:
Multiple applications can expose the same database to different
levels of access and threats, which can compromise the protection and privacy of
the data and require more complex and fine-grained authorization and auditing
mechanisms.
One possible way to mitigate some of the issues is to use a service layer or an API to abstract
and mediate the access to the database and to implement appropriate caching, partitioning,
replication, or synchronization strategies to balance the load and ensure the consistency of the
data. Another possible way is to use separate databases or schemas for each application and
use web services or other integration methods to share the expected data. Ultimately, the
choice depends on each scenario's specific needs and trade-offs.
I thought about it and realized that the requirement is not to share the data table but to share
the data.
We should avoid sharing physical resources.
If application A updates its data, we must send the updates to
application B and vice versa. The data store (database) is irrelevant.
Solution
Active Integration allows customers to implement this solution parallel to the production
system in about 1 week.
In week 2, we begin to run regressions to ensure the data in both the production system and
the parallel system look correct.
In week 3, we can change the application's connection to be decoupled.
We have decoupled a complex system in 3 weeks.
It is a very big deal. It can be very simple to accomplish with huge benefits.
What we do
We work with you to define the problem.
We propose solutions that will meet your requirements.
We implement your solution for you.
We manage the entire process from end to end.