How to read data and store them to a 2D array with numpy

Question

I want to store data of different lengths, read from different files, in a 2D array.

DataFile:
data1.txt
1,2,3,4

data2.txt
1,2

data3.txt
1,2,65,7,8,9,0,5,4,8,3,43
....

Array:
x=numpy.array([])

x[file_index][data]:
x[0] ---- returns the data from data1.txt
x[1] ---- returns the data from data2.txt
....

What I have tried:

I can do this with a list, as follows:

Python

x = []
try:
    # The file names could also be read from a .txt file.
    nameOfPath = ["data1.txt", "data2.txt", "data3.txt"]  # ..... further files
    for each_item in nameOfPath:
        with open(each_item, "r") as dataFile:
            for each_line in dataFile:
                # Strip the trailing newline before splitting on commas.
                x.append(each_line.strip().split(","))
except IOError as e:
    print(e)

However, I want to do this with a numpy.array.
I have tried many times, but it does not work.

For example, I used np.vstack:

Python

x = np.vstack([x, x1])

The error says:

all the input array dimensions except for the concatenation axis must match exactly

The dimensions of the arrays x and x1 are not the same.
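
The failure can be reproduced without any files. The following is only a minimal sketch: the values are copied from data1.txt and data2.txt above, and the names row_a, row_b, stacked and ok are made up for illustration.

```python
import numpy as np

row_a = np.array([1, 2, 3, 4])  # contents of data1.txt
row_b = np.array([1, 2])        # contents of data2.txt

try:
    stacked = np.vstack([row_a, row_b])
except ValueError as err:
    # Rows of length 4 and 2 cannot form a rectangular 2D array.
    stacked = None
    print("vstack failed:", err)

# Rows of equal length stack without complaint.
ok = np.vstack([row_a, row_a])
print(ok.shape)
```

np.vstack builds one rectangular block, so every row must have the same number of columns.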

So, if I still want to use numpy.array, how should I implement this?

Thank you!

Email: gz.geophysics@outlook.com

Posted 10-Dec-17 0:26am

Zhang, G.

Updated 17-Dec-17 2:19am

Comments

Richard MacCutchan 10-Dec-17 7:36am

The error message is clearly telling you that you cannot do it if the dimensions of the individual arrays are different. You could create some simple arrays first with the data, then normalise them to the length of the longest.
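
One possible reading of "normalise them to the length of the longest" is to pad the shorter rows. The sketch below assumes NaN is an acceptable filler (which forces a float array); the names rows, longest and padded are hypothetical, and the values come from the three data files in the question.

```python
import numpy as np

# Values from data1.txt, data2.txt and data3.txt in the question.
rows = [
    [1, 2, 3, 4],
    [1, 2],
    [1, 2, 65, 7, 8, 9, 0, 5, 4, 8, 3, 43],
]

longest = max(len(row) for row in rows)

# Start from an array full of NaN, then copy each row into its leading columns.
padded = np.full((len(rows), longest), np.nan)
for i, row in enumerate(rows):
    padded[i, :len(row)] = row

print(padded.shape)
print(np.nansum(padded, axis=1))  # NaN-aware sum that skips the padding
```

With NaN padding, functions such as np.nansum and np.nanmean ignore the filler cells, while plain np.sum would return NaN for the padded rows.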

Zhang, G. 10-Dec-17 18:20pm

In fact, a list can do this.
However, numpy.array has more attributes and functions, so I would still like to use an array to store the data.

Richard MacCutchan 11-Dec-17 2:57am

Fine, but you must still follow the rules.

2 solutions

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

VISWESWARAN1998 · Accepted Answer · 2017-12-10T06:31:00

I skimmed your code and noticed that you are reading a text file and splitting on ",". I think you should use a CSV file instead of a .txt file, which will be easier to process.

Since you didn't specify what you are trying to achieve with this code, I made a few assumptions.
I see words like DataFile, Data, etc., so I guess you are doing some data analysis. In that case you should use pandas to read the data into data frames rather than processing it on your own, since there are other concerns such as pre-processing.

Here is one of your example files:

DataFile:
data1.txt
1,2,3,4

I have converted this into a CSV file named data.csv.

Here is the code:

Python

import pandas as pd
import numpy as np

if __name__ == "__main__":
    data_frame = pd.read_csv("data.csv", header=None) # since you didn't specify header in your question
    np_array = data_frame.iloc[:, :].values # [:, :] => [rows, columns]
    print(np_array)  # print the numpy array
    print(np_array.ndim)  # print the dimension

This outputs:

[[1 2 3 4]]
2  => number of dimensions

Read more here:
numpy.ndarray.ndim — NumPy v1.13 Manual

One more thing: NumPy has a method named "reshape", which changes an array to the dimensions you want, provided the total number of elements stays the same.
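
As a small illustration of what reshape can and cannot do (a sketch only; values, grid and failed are made-up names):

```python
import numpy as np

values = np.arange(12)       # 12 elements: 0..11
grid = values.reshape(3, 4)  # 3 * 4 == 12, so this works
print(grid.shape)

failed = False
try:
    values.reshape(5, 3)     # 5 * 3 == 15, which does not match 12 elements
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```

reshape only rearranges existing elements, so by itself it cannot turn rows of different lengths into a rectangular array.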

Zhang, G. · Accepted Answer · 2017-12-17T02:19:00

I found a solution that does this.

examples\data1

1   2   3

examples\data2

4   5   6   7   8   9   10

examples\data3

11  12

Python

import numpy as np

x = []
try:
    # Raw strings keep the backslashes in the Windows paths literal.
    # The file names could also be read from a .txt file.
    nameOfPath = [r"examples\data1", r"examples\data2", r"examples\data3"]
    for each_item in nameOfPath:
        with open(each_item, "r") as dataFile:
            for each_line in dataFile:
                # The sample files contain no commas, so each line stays one string.
                x.append(each_line.rstrip("\n").split(","))
except IOError as e:
    print(e)

np_x = np.array(x)  # one row per line read, each holding a single string
print(len(np_x))
print(np_x)

Output:

3
[['1   2   3']
 ['4   5   6   7   8   9   10']
 ['11  12']]
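
Note that in the output above each row holds one string, not separate numbers. If numeric rows of different lengths are wanted, one option is a 1-D array with dtype=object whose elements are themselves 1-D arrays. This is only a sketch: parse_line is a hypothetical helper, and the lines list stands in for the contents of the three example files.

```python
import numpy as np

def parse_line(line):
    """Split a line on commas and/or whitespace and return a 1-D integer array."""
    tokens = line.replace(",", " ").split()
    return np.array([int(tok) for tok in tokens], dtype=int)

# Stand-ins for the contents of examples\data1, data2 and data3.
lines = ["1   2   3", "4   5   6   7   8   9   10", "11  12"]

# An object array stores one reference per file; each element is its own array.
ragged = np.empty(len(lines), dtype=object)
for i, line in enumerate(lines):
    ragged[i] = parse_line(line)

print(ragged[0])                 # numbers from the first file
print([len(r) for r in ragged])  # row lengths differ
```

An object array keeps the x[file_index] access from the question, but vectorised operations have to be applied to each row separately, since the rows are independent arrays.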
