Just a random question since I already have a different solution but I had the following situation.
I have a table with a billion rows (actually probably about 1.1 billion.)
I was trying to scan the table, read every row, and do an analysis.
I certainly couldn't load the entire table, so I was using a paged query (LIMIT/OFFSET). Each page took about 90 minutes for the query itself - so not really something that was going to allow me to do much analysis.
Any other ideas on scanning it or speeding it up?
At one point I was even considering just dumping it and writing an app to do the analysis outside of the database. That was about the only other solution I had.
The database was MySQL (AWS Aurora actually).
The relevant parts of the table were as follows, and the id is the primary key. (I didn't design the table.)
I need to do the following:
1. Verify that RefId1 and RefId2 are in a different table, either in that order (1,2) or reversed (2,1).
2. Report if neither or only one id is found.
3. Report if more than one match is found.
4. If a match is found, report whether a different column (not documented above) has the same value as in the second table.
The second table also has a billion rows.
Both tables have indexes on the id, and the two other columns documented above.
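For what it's worth, the neither/one/many check could be sketched roughly like this. This is a minimal sketch: the table names (t1, t2) are made up, the column names are taken from the description above, and an OR'd join condition like this is unlikely to use the indexes efficiently at this scale.

```sql
-- Hypothetical sketch: t1 and t2 stand in for the two billion-row tables.
-- Finds rows whose (RefId1, RefId2) pair matches the other table either
-- in order (1,2) or reversed (2,1), and flags 0 or >1 matches.
SELECT t1.id,
       COUNT(t2.id) AS match_count
FROM   t1
LEFT JOIN t2
       ON (t2.RefId1 = t1.RefId1 AND t2.RefId2 = t1.RefId2)
       OR (t2.RefId1 = t1.RefId2 AND t2.RefId2 = t1.RefId1)
GROUP BY t1.id
HAVING COUNT(t2.id) <> 1;   -- neither, or more than one, match found
```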
Oh, that is ugly - the (1,2) / (2,1) check is going to cost you.
Can you do it in a couple of steps? Inner join on the two layouts and eliminate the matches from the process, possibly even moving the non-matching records into another table for your reporting analysis (one assumes the majority have valid matching records).
Never underestimate the power of human stupidity
Paging was not part of the solution; it was necessary only because I could not read the entire collection into memory. However, I did need to ensure that all of the data was read and that the same row was not read twice.
That's why you need to have an index on the id. (or any unique column or combination of columns)
The index is already sorted, by definition.
So the WHERE id > @PreviousMaxID just walks down the b-tree to the right id and starts counting leafnodes until it reaches @PageSize.
I don't know how AWS Aurora is organized, but if your table is clustered, you don't even need to look up the data pages and Bob's your uncle.
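In MySQL terms, that keyset (seek) approach looks something like the sketch below. The table and column names are assumptions; the point is the `WHERE id > @PreviousMaxID` seek instead of an ever-growing OFFSET.

```sql
-- Keyset pagination: each batch starts where the previous one ended, so the
-- b-tree is walked once per batch instead of re-skipping OFFSET rows each time.
SELECT id, RefId1, RefId2
FROM   big_table                 -- hypothetical table name
WHERE  id > @PreviousMaxID       -- last id seen in the previous batch (start at 0)
ORDER BY id
LIMIT  10000;                    -- page size

-- After processing the batch:
-- SET @PreviousMaxID = (largest id returned above), then repeat until empty.
```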
Hi - we migrated from SQL Server 2008 to 2012. We have a stored proc which inserts dates into a table. We insert an initial row whose identity key value is -1 with an insert statement. For the rest of the dates, we have a WHILE loop. Within the loop, the next identity key value comes out as 0 in SQL Server 2012 but as 1 in SQL Server 2008. Both servers have the table's column defined with an identity seed of 0 and an increment of 1.
Why does SQL Server 2008 have 1 as the next value after -1, while SQL Server 2012 has 0 as the next value after -1?
The fact that you have a WHILE loop fills me with horror. However, without being able to see the SP code, we can't possibly comment on what might be going wrong.
You have contradicted yourself, however. You stated:
We insert an initial value wherein the identity key value is -1,
but you also state:
Both server have the table's field name defined with identity_seed as 0, and increment as 1.
Without checking I can't confirm, but perhaps there was a hole in 2008 that allowed you to trample all over the IDENTITY column.
You should use IDENTITY (-1,1) if you want the first key to be -1 (but I suspect you don't).
Try posting the code so we can suggest ways of improving it ... like getting rid of the WHILE loop
If rows are present in the table, the next row is inserted with the new_reseed_value value. In SQL Server 2008 R2 and earlier versions, the next row inserted uses new_reseed_value + the current increment value.
So if you reseed to 0, and your increment is 1, the next value in SQL 2008 R2 or earlier will be 1, whereas the next value in SQL 2012 or later will be 0.
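A minimal repro of that documented RESEED difference might look like this (the table name is made up; run it on each version and compare the second id):

```sql
-- Demonstrates the DBCC CHECKIDENT RESEED behaviour change across versions.
CREATE TABLE dbo.IdentTest (id INT IDENTITY(0, 1), d DATE);

INSERT INTO dbo.IdentTest (d) VALUES ('2014-01-01');   -- id = 0
DBCC CHECKIDENT ('dbo.IdentTest', RESEED, 0);          -- reseed with rows present

INSERT INTO dbo.IdentTest (d) VALUES ('2014-01-02');
-- SQL Server 2008 R2 and earlier: id = 1 (new_reseed_value + increment)
-- SQL Server 2012 and later:      id = 0 (new_reseed_value)
```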
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
I am using a MERGE statement in my stored proc. When I try to get the inserted values, it's giving me all the matched and unmatched ones.
In the example below:
MERGE [dbo].[Service_Program_Detail] AS T
USING (SELECT *, @PK_Target AS 'PK_Target'
       FROM [dbo].[Service_Program_Detail]
       WHERE FK_Program_Code_ServiceProgDetails = @PK_Source
         AND FK_Fiscal_Period_Code_ServiceProgDetails = @Fiscal_Period_Code) AS S
ON (T.FK_Service_Code_ServiceProgDetails = S.FK_Service_Code_ServiceProgDetails)
WHEN NOT MATCHED
    THEN INSERT (FK_Service_Code_ServiceProgDetails, FK_Program_Code_ServiceProgDetails
        , FK_Fiscal_Period_Code_ServiceProgDetails, CreatedBy, ModifiedBy
        , CreatedOn, ModifiedOn, FK_Service_Program_Detail_Unit_Types)
    VALUES (S.FK_Service_Code_ServiceProgDetails, S.PK_Target
        , S.FK_Fiscal_Period_Code_ServiceProgDetails, @ExecutedBy, @ExecutedBy
        , GETDATE(), GETDATE(), S.FK_Service_Program_Detail_Unit_Types)
WHEN MATCHED
    THEN UPDATE SET T.FK_Service_Program_Detail_Unit_Types = S.FK_Service_Program_Detail_Unit_Types;
The OUTPUT clause is returning both the not-matched (inserted) rows and the matched (updated) rows, but I want only the not-matched rows to go into @Service_Program_Detail. My question: is there any way to output the matched and not-matched rows separately? I can get only the not-matched rows by keeping just the WHEN NOT MATCHED branch, but I also want the update to happen; the updated rows just shouldn't land in @Service_Program_Detail, and should ideally go into a different table variable.
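One common pattern for this (sketched here on a hypothetical Target/Source pair, not your schema) is to OUTPUT $action alongside the columns and filter on it. Wrapping the MERGE in an INSERT...SELECT (composable DML) lets you keep only the rows where $action = 'INSERT':

```sql
-- Sketch: $action is 'INSERT', 'UPDATE' or 'DELETE' for each row MERGE touched.
DECLARE @InsertedOnly TABLE (id INT, val INT);

INSERT INTO @InsertedOnly (id, val)
SELECT ch.id, ch.val
FROM (
    MERGE dbo.Target AS T
    USING dbo.Source AS S
        ON T.id = S.id
    WHEN NOT MATCHED THEN
        INSERT (id, val) VALUES (S.id, S.val)
    WHEN MATCHED THEN
        UPDATE SET T.val = S.val
    OUTPUT $action AS act, inserted.id, inserted.val
) AS ch
WHERE ch.act = 'INSERT';   -- keep only the newly inserted rows
```

The updates still execute; they are simply filtered out of the table variable. Capturing the updated rows in a second table variable would need a separate OUTPUT target (e.g. OUTPUT ... INTO a staging table and split it by the action column afterwards), since a composable DML wrapper can only insert into one table.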
"There is already enough hatred in the world lets spread love, compassion and affection."
Any help figuring out this query is highly appreciated.
I have three tables. (Scripts to load test data copied below).
First table #UserAccounts has UserAccounts, AccountType and BillingAccountKey.
Second table #BillingTransactions has BillingTransactionKey (surrogate key), BillingAccountKey and BillingCode.
UserAccount and BillingAccountKey have a 1-to-1 relationship.
Third Table #BillingCodeRank has BillingCode and Rank.
I want to see the billing code associated with each UserAccount (with account type 'O') that has the max rank.
Example: UserAccount 456 has BillingAccountKey = 2 and has three billing codes (222, 333, 444) having rank 6, 5, 4 respectively.
The result should show BillingCode 222 associated with UserAccount 456, because code 222 has the max rank among the three codes.
The final result from the test data would be as shown below:
UserAccount AccountType BillingAccountKey BillingCode Rank
456 O 2 222 6
789 O 3 111 7
102 O 5 333 5
--Scripts to load test data
SELECT * INTO #UserAccount FROM (
SELECT 123 AS UserAccounts, 'I' AS AccountType, 1 AS BillingAccountKey
UNION ALL SELECT 456, 'O', 2
UNION ALL SELECT 789, 'O', 3
UNION ALL SELECT 101, 'I', 4
UNION ALL SELECT 102, 'O', 5) A
SELECT * FROM #UserAccount
SELECT * INTO #BillingTransactions FROM (
SELECT 1 AS BillingTransactionKey, 1 AS BillingAccountKey, 111 AS BillingCode
UNION ALL SELECT 2, 2, 222
UNION ALL SELECT 3, 2, 333
UNION ALL SELECT 4, 2, 444
UNION ALL SELECT 5, 3, 111
UNION ALL SELECT 6, 3, 555
UNION ALL SELECT 7, 3, 666
UNION ALL SELECT 8, 3, 222
UNION ALL SELECT 9, 5, 333
UNION ALL SELECT 10, 5, 777) A
SELECT * FROM #BillingTransactions
SELECT * INTO #BillingCodeRank FROM (
SELECT 111 AS BillingCode, 7 AS [Rank]
UNION ALL SELECT 222, 6
UNION ALL SELECT 333, 5
UNION ALL SELECT 444, 4
UNION ALL SELECT 555, 3
UNION ALL SELECT 666, 2
UNION ALL SELECT 777, 1
UNION ALL SELECT 888, 1
UNION ALL SELECT 999, 3
UNION ALL SELECT 101, 5) A
SELECT * FROM #BillingCodeRank
WITH cteOrderedBillingCodes AS
(
    SELECT T.BillingAccountKey, T.BillingCode, R.[Rank],
           ROW_NUMBER() OVER (PARTITION BY T.BillingAccountKey ORDER BY R.[Rank] DESC) AS RN
    FROM #BillingTransactions AS T
    INNER JOIN #BillingCodeRank AS R
        ON R.BillingCode = T.BillingCode
)
SELECT A.UserAccounts, A.AccountType, A.BillingAccountKey, C.BillingCode, C.[Rank]
FROM #UserAccount AS A
LEFT JOIN cteOrderedBillingCodes AS C
    ON C.BillingAccountKey = A.BillingAccountKey
    AND C.RN = 1
WHERE A.AccountType = 'O'
I am stuck in the middle of a project. I am having an issue with adding the data source in the "Entity Data Model Wizard".
Solution Explorer : Models (folder) : Add : New Item : ADO.NET Entity Data Model : EF Designer From Database : New Connection
Now here in "New Connection" I want to connect to my PostgreSQL database using a PostgreSQL data source, but I only have the SQL Server data source option there. Does anybody know how to deal with this error?
Note: I have added Npgsql and EntityFramework5.Npgsql through NuGet.
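One thing worth checking: the wizard's data source list does not come from NuGet packages alone. The Npgsql ADO.NET provider factory generally has to be registered in config (and a Visual Studio integration extension for Npgsql installed) before the wizard can show a PostgreSQL option. A typical app.config registration looks something like this sketch (exact version attributes depend on your Npgsql build):

```xml
<system.data>
  <DbProviderFactories>
    <!-- Registers the Npgsql provider factory so tooling can discover it -->
    <add name="Npgsql Data Provider"
         invariant="Npgsql"
         description=".NET Data Provider for PostgreSQL"
         type="Npgsql.NpgsqlFactory, Npgsql" />
  </DbProviderFactories>
</system.data>
```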