|
What database server are you using?
|
|
|
|
|
|
Example of finding records in D1 that are not in D2
SELECT D1.[DepartmentID]
,D1.[Name]
,D1.[GroupName]
,D1.[ModifiedDate]
FROM [AdventureWorks].[HumanResources].[Department] D1
LEFT JOIN [AdventureWorks2008R2].[HumanResources].[Department] D2
ON D1.DepartmentID = D2.DepartmentID
WHERE D2.DepartmentID IS NULL
Notice how the table is referenced by Database.Schema.Table.
Use the where clause to compare individual columns.
|
|
|
|
|
vanikanc wrote: Is there a way to compare data of the same table in two different databases?
Yes! Multiple! What have you Googled?
On a more serious note, RedGate has a nice tool that could help out.
vanikanc wrote: Thanks!
You're welcome
|
|
|
|
|
Hello,
When I try the followinf in sql 2005 i get an error message:
---
declare @rc varchar(8000)
select @rc = 'bcp hqsqlsrvr.web_ref.dbo.anytable_USA out c:\anytable_usa.dat -c -T'
exec xp_cmdshell @rc
---
an error occured while processing command line.
Any pointers?
thanks.
|
|
|
|
|
Try this one :
declare @rc varchar(8000)
select @rc = 'bcp "select col1, col2 from hqsqlsrvr.web_ref.dbo.anytable_USA" out c:\anytable_usa.dat -c -T'
exec xp_cmdshell @rc
Putting the query in double quote may solve the problem..
|
|
|
|
|
Hi all,
A bit of quick info before I come to my question:
My company uses SQL Server 2008, two databases for one application and almost no schemas.
Our biggest customer now has a DB of 40 gig (took them a few years to get that 'big').
One of the databases is for storing production data and the other is for some data our application needs and user settings.
Now I recently had a discussion with my boss about one particular table in one of our databases.
It stores settings on stock, sales, client servers, user rights, user settings and general settings (no joke... a topic about it would even shame the hall of shame).
I told my boss that normalization might be very necessary for this table, but he told me that normalizing is not always a good thing. In the case of this particular table I know it is, but I am more interested in when we should not normalize?
One of my boss' arguments was that one table to store 'all' settings is rather handy because it keeps the amount of tables low and you can always find a setting (for whatever) in one table.
I know there are some rules for normalizing and denormalizing, but rules are meant to be broken
So I was wondering if my boss' argument about having lesser tables is valid at all?
And I would like to see some real-life examples of people who chose not to normalize and for what reason.
|
|
|
|
|
This is a very good question - in fact, there's a lengthy discussion on the subject in my favorite SQL tuning book[^].
To me, denormalization is always an exercise in query optimization. I start off with my "ideal" (i.e. normalized) schema, populate its key parts with random but realistic data, quickly prototype my queries, and then look for particularly ugly execution plans.
What your boss had in mind is probably minimizing the number of joins: if you always look for all (or a large subset of all) settings at once, and you keep doing it over and over again, using a single table for all settings will reduce the number of joins, and therefore improve your speed. However, I would not make a decision one way or the other without running a query analyzer: optimizing without a profiler is like driving in the dark without a map.
|
|
|
|
|
Your boss is right and you are right, there is no magic rule that can be applied, it all comes down to "depends on requirements".
If you have a single table with an expanding number of columns to hold the settings then I would change the design. I often use a SettingType, SettingValue table and store the lot in there. Hoewever you cannot use FKs in that scenario so it is less than ideal.
I do have 1 general rule and it is based on the database usage, if it is a transactional database for supporting applications I normalise as much as is reasonable. If it is a reporting database, this includes BI support DBs then normalisation goes out the window in favour of query optimisation. Try servicing Oracle BI, it will drive a relational database person nuts.
Never underestimate the power of human stupidity
RAH
|
|
|
|
|
There is no specific rule as to when to normalize/denormalize. The answer is "it depends"
In general it is a trade off between Query response vs Cost of data inconsistency. Normalized data keep the data consistent at a cost of query time. See Database normalization[^] and Denormalization[^] for some descriptions in wikipedia.
|
|
|
|
|
Yep, it depends on the situation. I try to normalize as much as I can and then deal with troublesome processes.
One technique I have used is when it comes time to perform a bunch of queries against complex data, I make a denormalized copy of the rows I need (I hesitate to say "temporary table"). This allows me to perform the queries without tying up the actual tables -- a little like a mini reporting database that gets created on the fly I guess. This has been useful in systems with a lot of updates that require frequent display to multiple users.
|
|
|
|
|
Thanks for the answers
So basically just normalise as much as possible until performance issues occur?
Since we use a relational database with lots of read AND write (orders, stock, etc.) it would be best to normalize where possible?
One thing I didn't get from the denormalizing DB article on wikipedia was the usage of views. Is making a view a way of denormalizing?
Because in a view you use joins too, but in the end you can just select from one table(view)
|
|
|
|
|
Naerling wrote: the usage of views
Temporary denormalization, kinda like I described, but more temporary.
|
|
|
|
|
Naerling wrote: One of my boss' arguments was that one table to store 'all' settings is rather handy because it keeps the amount of tables low and you can always find a setting (for whatever) in one table.
That's one argument. Where's the rest?
Naerling wrote: I know there are some rules for normalizing and denormalizing, but rules are meant to be broken
That in itself isn't a reason to break the rules in a production environment. You can always test it on a VM, if required.
First up, I'd collect a list of arguments on why it shouldn't be normalized. You bosses' argument isn't really an argument to skip the normalization-process, a database doesn't perform better if the table-count is lower. Quite the opposite is true; if the datacollection within that table has several fields that different parts of the system uses, then updating a setting could become a costly operation since it would also have to update several indexes - possibly blocking other updates.
I tend to normalize every entity in the database, so help me Codd, to BNF. The only reason to denormalize that I stumbled across was partitioning
I are Troll
|
|
|
|
|
'The rest' of his arguments are: "it's easy that all settings are in one table", "you always know where to find a setting", "normalizing isn't always a good thing" and so on
In practice this isn't even true, because the 'type of setting' is stored as a varchar which isn't really documented anywhere. So user settings can be found "where [category] = 'usersetting'" or "where [category] = 'setting'" and maybe "where [category] = options". And then there's a column called sub which specifies what setting it is.
Don't use a 'like' in this table, because 'setting' could also give settings on stock or sales. But if you do use '=' for 'sales' you won't find 'sales_statusses' etc...
Then there's the employee column which does not hold a FK to the employee table because in most cases there is no employee for a particular setting. So I recently found lots of settings for non-existant employee's... It's really great
Actually there is no key or constraint at this table at all!
There is good news though. It's a very old table which was created with the best of intents at a time that my company did not know very much about SQL Server (migrated from DBF). We only keep it because redesigning this part of our DB (and software) would cost to much time at this moment. My boss does not think the table is good, it's just 'not that bad'
|
|
|
|
|
Naerling wrote: "it's easy that all settings are in one table"
For whom? The computer doesn't care.
Naerling wrote: "you always know where to find a setting"
He wouldn't be able to find them if they were divided over two tables?
<religious rant>
By that same logic, let's put everything in a *single* table; that way you always know in what table your data resides. Let's put everything under a single schema, for a single user and use a single password to access that single table. How about using a single VARCHAR-column in that table? That way you know exactly what column to search on and what column to retrieve from that table.
</religious rant>
Naerling wrote: "normalizing isn't always a good thing"
That's not even an argument, it's an opinion.
Naerling wrote: Actually there is no key or constraint at this table at all!
..then why use a database at all?
Naerling wrote: There is good news though. It's a very old table which was created with the best of intents at a time that my company did not know very much about SQL Server (migrated from DBF). We only keep it because redesigning this part of our DB (and software) would cost to much time at this moment. My boss does not think the table is good, it's just 'not that bad'
On Error Resume Next
I are Troll
|
|
|
|
|
I couldn't agree with you more!
Luckily not every table in our DB is like that. It's really the worst example I could think of by far
Yesterday I made some DB design with 9 tables and 2-3 columns per table. I had expected my boss to say I should use less tables. Instead he said I should've made 1 extra (maybe he reads this forum too?)
Don't start about on On Error Resume Next...
I know some very nice .NET examples of that. But that's another topic
|
|
|
|
|
Naerling wrote: Instead he said I should've made 1 extra (maybe he reads this forum too?)
Would've been quite the surprise if he'd mixed into the discussion
Naerling wrote: Don't start about on On Error Resume Next..
I are Troll
|
|
|
|
|
So here I have a table, TABLE1, which I copied over from a different server:
SELECT COUNT(*) FROM TABLE1 gives me, e.g., 83977 records.
However, there's another table, TABLE2, which lives on the current server, and I want to know if TABLE1 and TABLE2 have identically the same data. That's kind of a different question than I am asking here, because that question depends on how the tables are indexed, keyed etc. For now, however, I do a
SELECT COUNT(*) FROM TABLE2 and it gives 83911 records.
So the two tables differ by 66 records. How do I say to the database, "OK, what are those 66 records?"
I tried a LEFT OUTER JOIN between TABLE1 (on the left) and TABLE2 (on the right) and I join on the tables' keys and filter on the right table's keys being NULL. I get 130 rows. I'm confused, shouldn't I have 66 rows?
Brian
Sincerely Yours,
Brian Hart
|
|
|
|
|
Assume table A contains keys:
a
b
c
d
and table B contains keys:
a
b
e
f
Both tables contain 4 rows.
However, an oouter join of the kind you describe would return 2 rows (c and d).
In your case, table 1 contains 130 keys which are not in table 2. Table 2 contains 64 rows which are not in table 1. Hence, the difference in size is 66 rows (130 - 64).
|
|
|
|
|
What you are really after is a union of what is in TABLE1, that's not in TABLE2, along with what is in TABLE2, that's not in TABLE1. I'd try the following
select * from TABLE1 t1
where not exists ( select * from TABLE2 t2 where t1.key = t2.key )
union all
select * from TABLE2 t2
where not exists ( select * from TABLE1 t1 where t2.key = t1.key )
You might want to create a view for this, depending upon the circumstances.
Chris Meech
I am Canadian. [heard in a local bar]
In theory there is no difference between theory and practice. In practice there is. [Yogi Berra]
posting about Crystal Reports here is like discussing gay marriage on a catholic church’s website.[Nishant Sivakumar]
|
|
|
|
|
Nice try, but TABLE1 and TABLE2 have completely different record counts (right, they are off by 66) so the database complains:
Msg 205, Level 16, State 1, Line 1<br />
All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists.
Sincerely Yours,
Brian Hart
|
|
|
|
|
If that is the error message, that means the the * which is being selected in each case is different. Replace the * in my select with just the key identifier columns and it should work.
Chris Meech
I am Canadian. [heard in a local bar]
In theory there is no difference between theory and practice. In practice there is. [Yogi Berra]
posting about Crystal Reports here is like discussing gay marriage on a catholic church’s website.[Nishant Sivakumar]
|
|
|
|
|
It's not complaining because the number of rows don't match. It's complaining because the number of columns don't match.
See Chris's previous answer for how a suggestion on how to correct the query.
|
|
|
|
|
[Product plug]
We use Red-Gate SQL data compare for this job and it is excellent but a bit expensive for a single requirement.
[/Product plug]
Do an inner join between the 2 tables using EVERY column, this will identify the identical records, then select from the(s) where the Ids are not in the inner join.
Inspect this result and discard the joins that are not relevant and repeat.
When you have the critical records decide what you need to do to merge them.
Never underestimate the power of human stupidity
RAH
|
|
|
|