Hi - I have a VB.NET application that needs to read a "hypercube" data file. The format is a header (32768 chars) followed by an array (640,640,120) of UShort. The C code to read this runs pretty fast:
C++
void CCubeReader7Dlg::OnBnClickedOk()
{
    FILE *F_IN, *F_OUT;
    char* header;
    unsigned short* data;
    unsigned short* value;
    unsigned short* out;
    data = (unsigned short*)malloc(520*sizeof(short)); // a single row of the data array
    out = (unsigned short*)malloc(2*696*128*sizeof(short)); // output storage array
    header = (char*)malloc(32768); // header
    // Open the input and output files
    fopen_s(&F_IN,"C:\\Documents and Settings\\Leif Hendricks\\Desktop\\xyzzy.cube","rb");
    fopen_s(&F_OUT,"C:\\Documents and Settings\\Leif Hendricks\\Desktop\\xyzzy.out","wb");
    // Read past the 32768 byte header
    fread(header,sizeof(char),32768,F_IN);
    // Loop over the columns and bands, reading a row at a time (visually rows and cols are reversed)
    for(int i=0;i<696;i++)
        for(int b=0;b<128;b++){
            fread(data,sizeof(short),520,F_IN);
            out[i+b*696] = data[519]; // store data from column 519
            out[i+b*696+696*128] = data[204]; // store data from column 204
        }
    // Write the stored data out to a file
    fwrite(out,sizeof(short),696*128*2,F_OUT);
    // Close both files and release the buffers
    fclose(F_IN);
    fclose(F_OUT);
    free(data);
    free(out);
    free(header);
}



I have written this in VB.NET (with more than a week's worth of help from Microsoft technical support), but it runs really slowly - I believe the StreamReader is the problem. In short, I want to read a one-record file, then be able to quickly scan the 3-dimensional array for selected cell values and record the index locations in a separate file. Help would be really appreciated.

Thanks

Bob


VB
Sub LoadCubeData()
       Dim MaxRows As Integer = 640
       Dim MaxColumns As Integer = 640
       Dim MaxBands As Integer = 120
       Dim r As Integer = 0
       Dim b As Integer = 0
       Dim c As Integer = 0

       Dim array3D(MaxRows, MaxColumns, MaxBands) As UShort '49152000
       Dim arraysize As Integer = MaxBands * MaxRows * MaxColumns * 4 + 32768
       Dim myString As String = "No Data"
       Dim myCharString As Char() = ""
       Dim StringLength As Integer = 0
       Dim rowbuffer(4) As Char
       Dim colbuffer(4) As Char
       Dim linebuffer(4) As Char
       Dim returnValue As Integer = 0
       Dim currentPosition As Integer = 0
       Dim HeaderCharString(32768) As Char
       Dim CellValue(arraysize) As Char
       Dim CellValuePosition As Integer = 0

       Dim sr As StreamReader = New StreamReader(InputImageName)
       returnValue = sr.Read(HeaderCharString, currentPosition, HeaderCharString.Length)
       myString = HeaderCharString.ToString
       currentPosition = returnValue
       Dim iter As Integer
       iter = 0

       For c = 0 To MaxColumns
           For r = 0 To MaxRows
               For b = 0 To MaxBands

                   array3D(c, r, b) = 0
                   sr.ReadBlock(CellValue, currentPosition, 2)

                   
                   array3D(c, r, b) = Convert.ToInt32(CellValue.GetValue(currentPosition))
                                                       
                   currentPosition = currentPosition + 2

                   iter = iter + 1

                   Trace.WriteLine("brc" & "," & b & "," & r & "," & c & "'" & iter)
                   XPositionOut.Clear()
                   XPositionOut.AppendText(c)
                   YPositionOut.Clear()
                   YPositionOut.AppendText(r)
                   BandPositionOut.Clear()
                   BandPositionOut.AppendText(b)
                   CellValueOut.AppendText(Convert.ToString(array3D(c, r, b)))

               Next
           Next
       Next
       sr.Close()
   End Sub

Your code appears significantly different: you have three nested loops, where the original has only two.

However, if you need it really quick, why not leave it in Native code and call it via Interop? You get the best of both worlds then: known working, fast code, usable in a .NET environment. Since it doesn't return anything, or affect any external variables (that I can see) it shouldn't be a problem...
 
Comments
rfrank5356 12-Apr-11 16:24pm    
Wow - that was a fast response - thanks -

I posted the wrong C code - here is the code from the hardware manufacturer:

unsigned short datacube[640][640][120]; // array of two-byte integers (column, row, band)
unsigned char dummy[32768];

// "in" is assumed to be the FILE* of the opened cube file
fread(dummy, sizeof(unsigned char), 32768, in); // skip the header
for (int column = 0; column < 640; column++) {
    for (int band = 0; band < 120; band++) {
        for (int row = 0; row < 640; row++) {
            fread(&datacube[row][column][band], sizeof(unsigned short), 1, in);
        }
    }
}


I have looked at the interop and it looks like a complex process to implement.

More on the next post
Espen Harlinn 12-Apr-11 17:01pm    
Nice reply, 5ed!
You still need the class System.IO.StreamReader. You can rethink the whole I/O to get better performance, but one way to improve it is this: read the whole file into memory at once through System.IO.StreamReader.ReadToEnd and operate on the data from there. Of course, only if the volume of your data allows…

[EDIT]
You are working with volumes that are too big; in this case you need a different approach.
Let me see…

—SA
 
Comments
rfrank5356 12-Apr-11 16:30pm    
Hi - I have an Imports System.IO statement, which seems to make StreamReader visible. I suspect that ReadToEnd might be the solution, but when I tried to implement it I ran into problems defining the data. I thought a fixed-length record would be a solution, but ran into problems with initializing the array, type conversions, etc. Could you suggest a data structure that would work? Thanks
- bob
Sergey Alexandrovich Kryukov 12-Apr-11 19:02pm    
As you can see, we discussed your problem with Espen. As far as I can see, reading and writing this structure can be really fast. You just need to read it all in one chunk or in big chunks. It will be slow if you run your nested loop and read one element at a time. BinaryReader is correct, but it won't accelerate things because you already read in a binary way.

You see, your structure is very simple. My first question is: do you really want to keep the whole hypercube in memory (around 100MB)? Can the size be different? If you want to do it at once, you can really read it at once, as an array of rank 1 of ushort. Or maybe you need to keep the hypercube in memory and read some pieces on demand? It's also not too slow.

Whatever you want, it can be done fast.

And what would be the access pattern for your hypercube (from memory or disk)? Random access, I guess. Just give me an idea of the usage. Is it sparse?

--SA
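
The "read it all at once as an array of rank 1" idea can be sketched in C, the language of the original reader. This is only a sketch under assumptions: `load_cube` and `cube_index` are hypothetical names, the file's byte order is assumed to match the machine's, and the index formula simply follows the loop order of the manufacturer's reader quoted earlier in the thread (column outermost, then band, row varying fastest).

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define HEADER_BYTES 32768L /* header size given in the question */

/* Read `count` 16-bit values that follow the header with a single fread call.
   Returns a malloc'd buffer (the caller frees it) or NULL on any failure.
   Assumption: the file's byte order matches the host's. */
uint16_t *load_cube(const char *path, size_t count)
{
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return NULL;

    uint16_t *data = malloc(count * sizeof *data);
    if (data != NULL &&
        (fseek(in, HEADER_BYTES, SEEK_SET) != 0 ||
         fread(data, sizeof *data, count, in) != count)) {
        free(data); /* seek failed or the file was shorter than expected */
        data = NULL;
    }
    fclose(in);
    return data;
}

/* Flat index of a (column, band, row) cell when row varies fastest,
   band next and column slowest (assumed from the manufacturer's loops). */
size_t cube_index(size_t column, size_t band, size_t row,
                  size_t bands, size_t rows)
{
    return (column * bands + band) * rows + row;
}
```

For the cube in the question the call would be `load_cube(path, (size_t)640 * 640 * 120)`, and a cell is then `data[cube_index(c, b, r, 120, 640)]` with no further disk access.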
rfrank5356 12-Apr-11 16:32pm    
The data will always be a single-record file, and I plan to use Dispose to free up memory after each read. The ultimate file sizes will be in the 100MB range, and I have a control file that lists the files to be processed.
Sergey Alexandrovich Kryukov 12-Apr-11 16:42pm    
What do you want to dispose to free up memory? In .NET you can dispose, but the memory is reclaimed by the GC. You probably mean unmanaged memory, which can be reclaimed indirectly as a result of disposing if it releases some unmanaged handles...

I need time to answer to your previous comment. I certainly know some answers, but we need to discuss them in some iterations.
--SA
Espen Harlinn 12-Apr-11 17:02pm    
Good point, 5ed!
Use BinaryReader; it has the BinaryReader.ReadInt16 method.

Use a FileStream and since you don't care about the first 32768 bytes set Position to 32768 before passing it to the BinaryReader constructor.

The performance should be excellent ...
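
The same idea (position the stream past the header, then read the 16-bit values through a buffer) looks roughly like this in C. It is a sketch, not the BinaryReader code itself: `count_in_range` and `CHUNK_VALUES` are made-up names, and the range test stands in for whatever per-value check is actually needed.

```c
#include <stdio.h>
#include <stdint.h>

#define HEADER_BYTES 32768L /* header size given in the question */
#define CHUNK_VALUES 4096   /* values per read; an arbitrary choice */

/* Count the 16-bit values in [lo, hi] that follow the header, reading
   CHUNK_VALUES at a time instead of one value per call.
   Returns -1 if the file cannot be opened or positioned. */
long count_in_range(const char *path, uint16_t lo, uint16_t hi)
{
    uint16_t buf[CHUNK_VALUES];
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    if (fseek(in, HEADER_BYTES, SEEK_SET) != 0) { /* skip the header */
        fclose(in);
        return -1;
    }

    long hits = 0;
    size_t n;
    while ((n = fread(buf, sizeof buf[0], CHUNK_VALUES, in)) > 0) {
        for (size_t i = 0; i < n; i++)
            if (buf[i] >= lo && buf[i] <= hi)
                hits++;
    }
    fclose(in);
    return hits;
}
```

Reading a few thousand values per call keeps the per-call overhead out of the inner loop, which is the point made elsewhere in this thread about reading in big chunks.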

Whatever you are doing here is fairly expensive since you are repeatedly converting r,b,c to text
Trace.WriteLine("brc" & "," & b & "," & r & "," & c & "'" & iter)                   
XPositionOut.Clear()                   
XPositionOut.AppendText(c)                   
YPositionOut.Clear()                   
YPositionOut.AppendText(r)                   
BandPositionOut.Clear()                   
BandPositionOut.AppendText(b)                   
CellValueOut.AppendText(Convert.ToString(array3D(c, r, b)))


just declare three integers r, b and c - there seems to be no reason in your vb code for all the other stuff since the final destination for the data seems to be CellValueOut, and friends, inside the inner loop.

>> with more than a weeks worth of help from Microsoft technical support
I sincerely hope you don't have to pay for this

Regards
Espen Harlinn
 
 
Comments
Sergey Alexandrovich Kryukov 12-Apr-11 17:46pm    
Agree, my 5. The performance depends more on how big the chunks one uses are and on buffering (so the code, not the disk operation, is actually the bottleneck). StreamReader can actually read binary as well. It all depends on serialization.
Now OP asks for data structures to be suggested to re-design it. It would need some extra information and thinking.
--SA
Espen Harlinn 12-Apr-11 17:56pm    
Thanks SAKryukov! OP probably doesn't need any data structures - just the integers. As OP is talking about 100MB files - he should look at MemoryMappedViewAccessor, part of the System.IO.MemoryMappedFiles Namespace
rfrank5356 12-Apr-11 18:28pm    
No - I didn't have to pay - but it still cost me as I have spun my wheels :)

The extra code is just to display progress - the final will reference a parameter file and look for cell values within a set range in particular bands in any row/column.

Looking at the MemoryMappedViewAccessor description - Back in a few
rfrank5356 12-Apr-11 19:37pm    
This is looking promising - but I am running into a structure challenge - I am defining
Public Structure SpectralBands
Public SpectralValue() As UShort
End Structure
and I want the length to be 120 occurrences of the spectral band.


Here is where I am at

Dim TotalFileLength As Long = 49184768


Using CubeData = MemoryMappedFile.CreateFromFile(InputImageName, FileMode.Open)
Using BandReader = CubeData.CreateViewAccessor(HeaderCharLength, 0)

Dim SpectralBand As UShort()
Dim i As Long = 0

Do While i < TotalFileLength
BandReader.Read(->> lost here
Loop
End Using

So I have a data structure and type problem - I'll hit this again in the am as I have one more project to solve today (aviation crab control software . . :))
Espen Harlinn 13-Apr-11 4:32am    
Use UnmanagedMemoryAccessor.ReadInt16
MemoryMappedViewAccessor is derived from UnmanagedMemoryAccessor
Hi all - thanks for all the help - I seem to have this mostly working and the speed is acceptable.

Here is the working code

Public Sub LoadCubeData()
Dim d1 As DateTime = DateTime.Now
Dim MaxRows As Integer = 640
Dim MaxColumns As Integer = 640
Dim MaxPixels As Integer = 640
Dim MaxBands As Integer = 120
Dim r As Integer = 0
Dim b As Integer = 0
Dim c As Integer = 0
Dim p As Integer = 0
Dim Length As Long
Dim OutputfileNumber As Integer = 2
Dim OutputFileName As String
Dim DataLine As String

OutputFileName = Replace(InputImageName, ".cube", "_Band15.csv")
Try
FileOpen(OutputfileNumber, OutputFileName, OpenMode.Output, OpenAccess.Write)
Catch ex As Exception
MsgBox("output File open error " & vbCrLf & ex.Message)
Stop
End Try

'One complete pixel is 120 positions x 2 bytes = 240 bytes
Dim Completepixel As Integer = 240
'One complete row is 640 pixels = 640 * 240 = 153600
Dim CompleteRow As Integer = 153600
' One image is 640 complete rows = 640 * 153600
Dim i As Long = 0
Dim offset As Int64 = 32768 ' should offset be + 1 more ?
Dim Header As String = ""
Dim HeaderString As String = ""
Dim StartofFile As Int64 = 0
' Create the memory-mapped file.
' Capacity 0 maps the file at its current size on disk; a larger capacity would extend the file to that size.
Using mmf = MemoryMappedFile.CreateFromFile(InputImageName, FileMode.Open, "IMGAA", 0, MemoryMappedFileAccess.ReadWrite)
Using accessor = mmf.CreateViewAccessor(StartofFile, Length) ' Length is 0, so the view runs from StartofFile to the end of the file
Dim ReadValue As Byte
Dim ByteArray(20) As Byte
For x = 0 To 13
ReadValue = accessor.ReadByte(x)
Header = ReadValue.ToString
ByteArray(x) = ReadValue
Next
HeaderString = System.Text.Encoding.Default.GetString(ByteArray)
MsgBox(HeaderString)
DataLine = HeaderString
PrintLine(OutputfileNumber, DataLine)
' suppose we need to find 15band for all the rows and columns
Dim band As Integer = 15
Dim band_value As UShort = 0
For r = 0 To MaxRows - 1
'For c = 0 To MaxColumns
For p = 0 To MaxPixels - 1
i = offset + r * CompleteRow + p * Completepixel + band * 2 ' each band value is 2 bytes, so the band index is scaled to a byte offset
Try
accessor.Read(i, band_value) ' 15th value is read, compare if this needed
DataLine = "Band 15 Value at r=, " & r.ToString & ", p= ," & p.ToString & ", i= ," & i.ToString & ", Value = ," & band_value.ToString & ","
' MsgBox(DataLine)
PrintLine(OutputfileNumber, DataLine)
Catch
MsgBox("Exiting at Catch, i = " & i.ToString)
Exit For
End Try
Next
Next
End Using
End Using
FileClose(OutputfileNumber)
Dim d2 As DateTime = DateTime.Now
Dim diff1 As System.TimeSpan
'Console.WriteLine("d2({0})", d2)
diff1 = d2.Subtract(d1)
Trace.WriteLine("diff1 ")
Trace.WriteLine(diff1)

End Sub
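
The offset arithmetic used above can be isolated into a small function, sketched here in C. It assumes the pixel-interleaved layout that the posted code uses (120 consecutive two-byte band values per pixel, 640 pixels per row), which differs from the loop order in the manufacturer's snippet and so should be checked against real data. `cell_offset` and the constants are names chosen for illustration. Two details matter: the band index is scaled by the 2-byte value size like every other term, and no "+ 1" is needed on the header offset, because the header occupies bytes 0 to 32767 and the data therefore starts at byte 32768.

```c
#include <stdint.h>

#define HEADER_BYTES    32768 /* header size given in the question */
#define BANDS           120
#define PIXELS_PER_ROW  640
#define BYTES_PER_VALUE 2     /* each cell is a 16-bit unsigned integer */

/* Byte offset of the (row, pixel, band) cell, all zero-based, assuming
   the pixel-interleaved layout used by the posted VB code. */
int64_t cell_offset(int64_t row, int64_t pixel, int64_t band)
{
    int64_t bytes_per_pixel = (int64_t)BANDS * BYTES_PER_VALUE;  /* 240 */
    int64_t bytes_per_row = bytes_per_pixel * PIXELS_PER_ROW;    /* 153600 */
    return HEADER_BYTES
         + row * bytes_per_row
         + pixel * bytes_per_pixel
         + band * BYTES_PER_VALUE;
}
```

Under these assumptions band index 15 of the first pixel sits at byte 32768 + 15 * 2 = 32798, and the last value ends at byte 32768 + 640 * 640 * 120 * 2 = 98336768, which is the file size to expect for a full cube.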
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


