You see things; and you say 'Why?' But I dream things that never were; and I say 'Why not?'

Sunday, June 28, 2009

Evaluation of fedarated databases.

Okay, there are no high end systems, no quad processor boxes, no RAID, and no dedicated network. It's an evaluation scenario at it's simplest form.

I horizontally partitioned a huge table (5 Million rows, not so huge anyways!) across two database servers with 2.5 million rows on each server. Idea is to compare the response times in the following two scenarios:
  • Having the table on one server (5 million rows)
  • Having the table spread across two servers (2.5 million rows on each server)
Here is the hardware configuration of both the servers:
Server1: Server1
OS: Microsoft Windows 2000 Advanced Server with SP1
System model: DELL OptiPlex GX1
Processor: Pentium III, 500MHz
RAM: 128 MB

Server2: Server2
OS: Microsoft Windows 2000 Advanced Server with SP1
System model: DELL OptiPlex GX1
Processor: Pentium III, 500MHz
RAM: 128 MB

Here is the table structure:
CREATE TABLE [dbo].[tblPriceDefn] (
[ConfigID] [numeric](19, 0) NOT NULL ,
[PriceListID] [varchar] (40) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[ValidityDate] [datetime] NOT NULL ,
[Price] [money] NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[tblPriceDefn] WITH NOCHECK ADD
CONSTRAINT [PK_PriceDefn] PRIMARY KEY CLUSTERED
(
[ConfigID],
[PriceListID],
[ValidityDate]
) ON [PRIMARY]

Horizontal partitioning of the data:
For this evaluation purpose I took the tblPriceDefn table which has 5 Million rows. I divided the table into two equal halves depending upon the value of the PriceListID column. Stored the first half on the first server (Server1) and the second half on the second server (Server2). Then I created a distributed partioned view named View1 in the first server (Server1) which cobmines both the halves of the table tblPriceDefn. Here is the view definition:

CREATE VIEW View1
AS
SELECT * FROM [Server1].test.dbo.table1
UNION ALL
SELECT * FROM [Server2\inst1].test.dbo.table1


Performance testing:
The following queries are executed on the first server locally on both the base table and the partitioned view and the response times were recorded. Each query is executed 5 times and the average response time is recorded

Query 1:
--On the base table:
SELECT
PriceListID,
COUNT(*) [# of rows]
FROM
ArenaDBRel2.dbo.tblPriceDefn
GROUP BY
PriceListID
ORDER BY
PriceListID

Average response time on the base table with 5 million rows: 59 seconds

--On the partitioned view:
SELECT
PriceListID,
COUNT(*) [# of rows]
FROM
test.dbo.View1
GROUP BY
PriceListID
ORDER BY
PriceListID

Average response time on the partitioned view : 41 seconds
A gain of 18 seconds which is a 30% improvement in performance

Query 2:

--On the base table:
SELECT
DISTINCT PriceListID
FROM
ArenaDBRel2.dbo.tblPriceDefn
Average response time on the base table with 5 million rows: 45 seconds

--On the partitioned view:
SELECT
DISTINCT PriceListID
FROM
test.dbo.View1

Average response time on the partitioned view: 32 seconds
A gain of 13 seconds which is a 28% improvement in performance


Query 3:

--On the base table:
SELECT
PriceListID,
ConfigID,
Price
FROM
ArenaDBRel2.dbo.tblPriceDefn
WHERE
PriceListID = 'M05-05'
OR
PriceListID = 'M10-01'

Average response time on the base table with 5 million rows: 59 seconds
--On the partitioned view:
SELECT
PriceListID,
ConfigID,
Price
FROM
test.dbo.View1
WHERE
PriceListID = 'M05-05'
OR
PriceListID = 'M10-01'

Average response time on the partitioned view: 44 seconds
A gain of 15 seconds which is a 25% improvement in performance

Conclusion:
The federated database servers feature in SQL Server 2000 is a very useful one and we can be sure to see performance gains in the query response times. As I mentioned earlier, this is a very simple evaluation of the feature. I am going to try it out with more concurrent connections and complicated queries and will post more information on this site. Be sure to check back after a while! Please let me know if you have any suggestions regarding this page!

No comments:

Followers