|
Yes, sorry. The prime indicates the transpose. U should be upper triangular if C is positive definite, correct. Thus U' will be lower triangular.
Certainly C should have all ones on its main diagonal (and be symmetric) but U may not necessarily have all ones, no.
A quick calculation using the correlation matrix C:
1.0000 0.6000 0.3000
0.6000 1.0000 0.5000
0.3000 0.5000 1.0000
gives U as:
1.0000 0.6000 0.3000
0.0000 0.8000 0.4000
0.0000 0.0000 0.8660
|
|
|
|
|
Zepplin,
Your answer makes a lot of sense to me. However, I need to ask a different question. Suppose that I have n random numbers that were generated independently with mean 0 and variance 1. However, I want these numbers to be correlated by a given correlation matrix (this is the same as my original question) in addition I have two vectors u and v. I want the ith number generated to have mean u[i] and variance v[i]. Can I do this by doing the following?
1) Use your above algorithm to generate n correlated random numbers
2) Multiply the ith number by v[i] to give it the correct variance
3) Then add u[i] to the ith number to give it the correct mean.
If I do all this, will I get the distribution of numbers that I seek?
Thanks
Bob
|
|
|
|
|
Yes, this will work. To be perfectly clear, this method will work if the random numbers you are generating are being drawn from a normal/Gaussian distribution with mean 0 and variance 1. Bear in mind that as a result of the law of large numbers[^] you will probably require a large number of samples to get close to the mean and variance that you want. I would suggest a minimum of 1000 samples and go up from there. Also, if you want the variance to be, say, 0.3 then you have to multiply (a normal variable with mean 0 and variance 1) by the square root of 0.3.
To summarize, if U is your Cholesky factor, and X is distributed as a normal variable with mean zero and covariance matrix I (the identity) then:
U*X is distributed as normal with mean zero and covariance matrix C.
Also, if you want a normal variable Z with mean m and covariance matrix C then Z = m + U*X where X is normal with mean 0 and variance-covariance matrix I (the identity). To change the variance of a given column of X, simply multiply that column by the square root of the variance you want to use. Since these are linear transformations, they will also preserve your correlation structure.
modified on Friday, November 7, 2008 2:58 AM
|
|
|
|
|
For many varied reasons, I decided to write my own balanced binary tree library. One feature of my library that I haven't seen in other freely available libraries is "clusters", and a "recluster" function.
The problem that I am attacking is that as elements are randomly added to the tree, the tree rotations that automatically happen to keep the tree balanced tend to spread the edge pointers all over the place. The end result is that in a large tree, a typical traversal can require reading nodes from all over physical memory (or disk pages): a very inefficient use of cache. By clustering several subtree "near the root nodes" together in physical memory, tree traversals can be sped-up significantly by keeping nodes that always get visited near one another within physical memory (or on the same disk page). For example, in a properly clustered tree, with clusters large enough to hold 4 levels of records (1+2+4+8 = 15 records), a tree of height 16 (>45000 records) can be traversed in only 4 cluster reads instead of 16 reads that might be required of a non-clustered tree.
What I don't know is a good way to measure how "out-of-cluster" a tree is. I'd like to be able to determine when a tree needs to be reclustered. I'm sure there is good theory out there for this; and the idea doesn't seem to be that difficult, but I haven't run onto the solution yet. I've spent a little bit of time googling for a method, and probably a couple of hours thinking about it. Nothing yet.
Any ideas or helpful links?
David
---------
Empirical studies indicate that 20% of the people drink 80% of the beer. With C++ developers, the rule is that 80% of the developers understand at most 20% of the language. It is not the same 20% for different people, so don't count on them to understand each other's code.
http://yosefk.com/c++fqa/picture.html#fqa-6.6
---------
modified on Wednesday, November 5, 2008 11:33 AM
|
|
|
|
|
Have you ever considered B+ trees as implemented by "The C Database Tool Chest", Mix Software. Mix is still on the web - Google "Mix Software". Note that the term "database" is, in fact, just a B+ tree. The database is a tree of memory or file blocks which contain multiple keys and records.
I got this in '85 and have used it ever since. I got the source software (don't know if this is still available). Not Open Source, but any programs that are released can be used without any royalty, only restriction is that it must be linked into an object, no DLLs, and the neither the source nor the library can be released - only the executable with the embedded code. With the source, you can make any modifications you want.
I modified it (originally 16bit - used Borland C to compile) to now use VS2008 thus 32 bit and supports >4GB files. Major rewrite to unwind the INT usage which now is 32 bit - in 16 bit an INT was 16 bit. Many conflicts with structure layout, and old 16 bit databases couldn't be read with the 32 bit version, nor the other way around. I built my version where I could declare which size of INT I wanted so I had only one converter to work with. I created two converters (16 bit and 32 bit) which could read the appropriate data base (in ascending order) and output a flat file with the data it contained, and then read in the flat file and write the other bit sized database.
Dave.
|
|
|
|
|
Member 4194593 wrote: Have you ever considered B+ trees as implemented by "The C Database Tool Chest", Mix Software.
No, I hadn't; but I am familiar with Mix, and have used both their Power C compiler and their C utilities toolchest.
I'll have to back-up and reconsider whether this would be a good approach to my end goal (which was NOT to rewrite the balanced binary tree algorithms).
I originally rolled my own balanced binary tree routines because the ones I found on the web didn't support duplicate keys, nor multi-threading. The clusters were an after-thought (actually, they came about from a "Hey, I can use the tree structure instead of yet-another-roll-my-own exotic structure -- IF I can speed them up sufficiently -- perhaps reclustering would do it).
Thanks for the thoughts.
David
---------
Empirical studies indicate that 20% of the people drink 80% of the beer. With C++ developers, the rule is that 80% of the developers understand at most 20% of the language. It is not the same 20% for different people, so don't count on them to understand each other's code.
http://yosefk.com/c++fqa/picture.html#fqa-6.6
---------
|
|
|
|
|
David,
Good that you like David. You be David, I be Dave. Good name.
The blocks in the B+ tree do not exactly correspond to the clusters you use, but do tend to group records together. The interesting thing is that the tree is self leveling. All blocks are identified by a block number. All of the full records are on the bottom level of the tree, linked horizontally by their block numbers, and have an array of offsets in each block to the records, thus constitute an "array", making it easy to go from one record to another. Above the "leaf" nodes at the bottom is an interior node or a layer of interior nodes which only contain the keys (not the additional data) and each key is associated with the block number containing the child nodes below. The first block (block 0) is a header block containing linking info (first leaf block, last leaf block, i.e. first record and last record), and the current root node. Whenever a node (either a leaf node or an interior node)) fills up (not when it is found to be full when a new record needs to be added, but whenever it becomes full due to the addition of a record), it attempts to offload some records to the prior same level block or following same level block, and if that is not possible, the block is split with half of the records in one and the second half in the other and these two blocks are linked. This also creates a new record for the parent block pointing to the block that just split, which may also cause the parent to split, on up the tree until maybe the root block needs to split thus creating a new root and updating the pointer of the header block. The blocks are allocated in memory with a buffer pool which is maintained with an LRU list, and when the buffer pool is full and a new block is needed, the LRU block is written (if dirty) to the database file, randomly by block number, or just discarded and reused if not dirty. Blocks can thus be either in-memory, or else on-file. Blocks are not written to file unless they hit LRU status and a new block is needed, or at the end, when the tree is closed. Multiple trees can be opened at one time (256 of then sharing the buffer pool). I build an application that needed to create a massive tree, then discard it and process other records comprised from the record numbers only. This caused the tree to be written out, then discarded. I wrote a new function to purge a tree without writing by just modifying the header record to indicate that it was empty, writing that out and closing the tree, then returning all buffer pool records to the pool, then discarding the file at the application level. Saved a massive amount of time.
Dave.
|
|
|
|
|
This discussion has sent me looking into B+trees. Some of the little code I've found looks ridiculously simple (by comparison to my BalBinaryTree implementation). I have no idea of speed comparisons, but it makes sense that they would be fast or else RDBMS's wouldn't use them. Plus it seems they might actually be smaller since every record in my tree system carries 12 bytes of overhead.
Yes, the mix source is still available, but it sounds like the right source to have is Mix's (which of course requires a small $ outlay) plus your modifications.
My (junk) e-mail address is davidlqualls@yahoo.com
I don't check it every day; and it fills up with garbage pretty fast, but if you'll put "B+Tree" in the subject line, I'll be looking for it.
Thanks.
David
|
|
|
|
|
DQNOK wrote: which of course requires a small $ outlay
The operative word here is "small" not like VS, and the library an be included in an executable royalty free.
Dave.
|
|
|
|
|
David,
Sorry this has taken so long to answer. It got buried on my email with Christmas correspondence, and you never responded to this thread again, and I got off on 15 other tangents.
Did you ever get the Mix source? I can't legally send you any "fixed" source unless you have the original source.
How is your balanced tree going?
Dave.
|
|
|
|
|
Thanks for remembering me.
I looked into it and decided you were right; using the binary tree was the wrong approach. Binary trees just do too much memory thrashing.
I also got off on a really hot project and had to set it aside for a while.
I downloaded a couple of free BTree libraries, but I'm never happy with stuff I get from outside. They will have been developed with their own support libraries, which usually conflict with mine, or they are missing features I want, or are bloated with features I don't want, etc.
I really just wanted an in-memory index, without the burden of a disk-based system. However, I really should consider a product that has disk-based storage in case I want to expand later on.
I'm still interested, just haven't taken "the plunge" yet. It's not the money; it's the time committment in learning a new system and adapting all my stuff to their way of doing things.
I'm just responding to your post without having gone back and reviewed everything that was said to this point. I'm going to do that, plus, if you have anything else new to add, I'd be glad to hear it. Perhaps I should just go ahead and buy the source...
David
---------
Empirical studies indicate that 20% of the people drink 80% of the beer. With C++ developers, the rule is that 80% of the developers understand at most 20% of the language. It is not the same 20% for different people, so don't count on them to understand each other's code.
http://yosefk.com/c++fqa/picture.html#fqa-6.6
---------
|
|
|
|
|
A second response:
Have you noticed that Mix now sells a Win32 version of their database toolchest? I don't suppose you've compared their implementation against yours, have you? I know nothing except what they've posted on their web site.
David
|
|
|
|
|
David,
It would be interesting to see how they converted it, however, I have already done that and it works great and I don't need to spend the money to see what they did. The primary fix was to get rid of all of the int's and make them CBT defined types, each one different so you knew what you were dealing with, then define these types as either Signed/Unsigned Longs or Signed/Unsigned Shorts depending on whether you wanted a version to be able to read an old database.
If you do get the source, I can give you my fixes for a PurgeTree to quickly kill a tree that is no longer needed.
Dave.
|
|
|
|
|
My office is testing a bunch of folks who have applied for Java jobs. I don't know head and tails
of the language other than that its syntax is stolen from here and there (I'm still partyin' with
MFC/Win32, yay!). My buddy was the invigilator and got the questions' sheet. This was one of the
easiest assignments, and shockingly, only 3 out of the 14 people that attempted this problem finished
it correctly on time. Some took more than 45 minutes but got it right, and others didn't solve it at
all! I tried it for the hell of it beat the time by 29 minutes, finishing at 16 minutes (C/DevCPP),
which I guess is not bad. Post your times here, and see if you can bring it down below 16 minutes.
Q7. (Note: You cannot answer Questions 8, 11 and 14 if you are answering this question.
Pick another 2 from the rest).
Write a command-line program as follows:
First the program will take as input a +ve integer from the user, let's call it
the TARGET.
It will then take in a set of +ve integers from the user, and here the user gets
to specify how many integers he/she wants to enter. Let's call this set the
RANGE.
Now the program should compute as follows:
- Find all combinations in the RANGE, the sum of which add up to the
target. Print such combinations at the command prompt. In a combination,
repeats are not allowed. i.e. (x + x + y) = TARGET is forbidden. Every element
in the combination must be unique.
- In case a user enters a RANGE element greater than the target, it should be immediately
discarded and the user must be prompted for the element again.
- Order shifted cases don't count. i.e. (x+y) and (y+x) being printed
separately is unnecessary and MUST be avoided.
- In case no combinations exist, the program should report so.
- Add exception handling code where necessary.
Example Run:
Enter Target: 15
How many set elements?: 5
Enter Element1: 7
Enter Element2: 8
Enter Element3: 12
Enter Element4: 3
Enter Element5: 22
Sorry. That element is larger than your target. Try again.
Enter Element5: 4
Combinations are:
1] 7+8
2] 8+3+4
3] 12+3
Time: 45 mins.
(At the top of the source file, please comment in your ID details along with the reference
number at the back of the day-card you were provided by the examiner. Write your Full Name,
ID key, TDC number and most importantly the reference number. Good luck.)
ASP - AJAX is SEXY. PERIOD.
|
|
|
|
|
Took roughly ~4mins, the next_combination function (similar to next_permutation) may someday end-up in the STL, if so probably 2mins or so....
#include <iostream>
#include <vector>
#include <algorithm>
#include <numeric>
template <class BidirectionalIterator>
bool next_combination(BidirectionalIterator first,
BidirectionalIterator k,
BidirectionalIterator last);
int main()
{
std::vector<int> v;
v.push_back(7);
v.push_back(8);
v.push_back(12);
v.push_back(3);
v.push_back(4);
int target = 15;
int combination_count = 0;
std::sort(v.begin(),v.end());
for(unsigned int i = 1; i <= v.size(); ++i)
{
std::vector<int> tmp;
std::copy(v.begin(),v.end(),std::back_inserter(tmp));
do
{
if (std::accumulate(tmp.begin(),tmp.begin() + i, 0) == target)
{
std::cout << ++combination_count << "] ";
for(unsigned int j = 0; j < i; ++j)
{
std::cout << tmp[j] << (j < (i-1) ? "+" : "\n");
}
}
}
while(next_combination(tmp.begin(),tmp.begin() + i,tmp.end()));
}
if (0 == combination_count)
{
std::cout << "No combinations found." < std::endl;
}
return 0;
}
template <class BidirectionalIterator>
bool next_combination(BidirectionalIterator first,
BidirectionalIterator k,
BidirectionalIterator last)
{
if ((first == last) || (k == first) || (k == last)) return false;
BidirectionalIterator i = first;
BidirectionalIterator ii = last;
++i;
if (i == last) return false;
i = last;
--i;
i = k;
--ii;
while (i != first)
{
if (*--i < *ii)
{
BidirectionalIterator j = k;
while(!(*i < *j)) j++;
iter_swap(i,j);
i++;j++;ii=k;
rotate(i,j,last);
while(j!=last){ ++j; ++ii;}
rotate(k,ii,last);
return true;
}
}
rotate (first,k,last);
return false;
}
|
|
|
|
|
My 10 mins. I really wanted to do something cool with yield, but the difficulty of nesting these really stuffed me. I wanted to recursively traverse the mask like a b-tree and yield out results along the way.
class SubsetSummer<br />
{<br />
static IEnumerable< long> SubsetSumCore(IList< int> values, int target)<br />
{<br />
long mask = (1 << values.Count);<br />
<br />
while(mask-- > 0)<br />
{<br />
int total = 0;<br />
for(int bit = 0; bit< values.Count; bit++)<br />
{<br />
if ((mask & 1 << bit) > 0) total += values[bit];<br />
}<br />
<br />
if (total == target) yield return mask;<br />
}<br />
}<br />
<br />
static IEnumerable< List< int>> SubsetSum(IList< int> values, int target)<br />
{<br />
foreach (var resultMask in SubsetSumCore(values,target))<br />
{<br />
var results = new List< int>();<br />
for (int bit = 0; bit < values.Count; bit++)<br />
{<br />
if ((resultMask & 1 << bit) > 0) results.Add(values[bit]);<br />
}<br />
yield return results;<br />
}<br />
}<br />
<br />
static void Main(string[] args)<br />
{<br />
var values = new []{7, 8, 12, 3, 4};<br />
<br />
foreach (var result in SubsetSum(values, 15))<br />
{<br />
Console.WriteLine(string.Join(" + ", result.ConvertAll(i => i.ToString()).ToArray()));<br />
}<br />
}<br />
}
(wow that really shagged my generics)
|
|
|
|
|
Sorry you have failed. The program does not meet the requirements. Where is the user input and validation?
If it does not need to meet them then this program will do the job and take a lot less time and thought.
class SubsetSummer
{
static void Main(string[] args)
{
Console.WriteLine("3 + 4 + 8");
Console.WriteLine("3 + 12");
Console.WriteLine("7 + 8");
}
}
|
|
|
|
|
You forgot to place this in the punintended namespace.
ASP - AJAX is SEXY. PERIOD.
|
|
|
|
|
I guess if you find user input and validation challenging then you can focus on those parts. This is maths & algs, not "C# 101".
|
|
|
|
|
Hi all,
I read about several methods for constructing an ellipse with straightedge and compass, and one seemed very peculiar to me. It was called the trammel method...and it's the only one that I can't prove. It seemed so simple, but I just can't prove it.
If anyone could find a proof for me or point me in the right direction?
And also if this is in the wrong section please kindly inform me which section it should be in.
Thanks.
|
|
|
|
|
What do you mean by 'proof'? Are you implying you want an algebraic proof of the trammel method?
There is a web page here[^] that explains how to do it using a trammel and a nice animated one here[^].
|
|
|
|
|
Hi,
the proof is rather simple:
let the coordinates of A,B and O be called (x,y), (xB,0) and (0,yO) respectively. Then (based on AB=b and BO=AO-AB=a-b):
xB= x*(a-b)/a
yO= -y*(a-b)/b
(congruent triangles, or rule of three)
Now triangle "B,O,origin" is rectangular, hence xB*xB + yO*yO = (a-b)*(a-b)
(Pythagoras)
substituting the earlier formulae for xB and yO, and simplifying yields
(x/a)^2 + (y/b)^2 = 1
which is exactly the equation of an ellipse with radii a and b. QED.
modified on Sunday, October 26, 2008 3:08 PM
|
|
|
|
|
Now I just feel stupid.
Thanks.
|
|
|
|
|
hi Dear,
I have one item and its Avg Selling Price is 10 and Qty is 10 which makes its value to 10 * 10 =100
so now if i do some purchase and get one more item on the Price of 4 then it makes Qty 10+1 =11
and Avg Selling Price 100 + 4 =104
while if if divide 104 / 11 = 9.45 so its Avg Price becomes 9.45 and if i multiply
9.45 *11 =103.95
while my stock value is 104 means loss of 0.05
any Suggestion
Thanks in Adv
regards,
Mirza Rahman
|
|
|
|
|
softdev_sup wrote: any Suggestion
Yes, use a better calculator or rational numbers.
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler.
-- Alfonso the Wise, 13th Century King of Castile.
This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong.
-- Iain Clarke
[My articles]
|
|
|
|
|