Complete newbie here. I'm trying to parse a fairly long simulation output with Python into a pandas DataFrame and write it to an Excel sheet.
I only want to parse certain entries, not the whole thing (see my code below).
The output I am trying to parse:
F100.T,557.9567856878748,F,F,F,F,
F100.Tv,557.9567856878748,F,F,F,F,
F100.Tl,557.9567856878748,F,F,F,F,
F100.Duty,-106382.60618934222,F,F,F,T,1
...
F200.T,557.9567856878748,F,F,F,F,
F200.Tv,557.9567856878748,F,F,F,F,
F200.Tl,557.9567856878748,F,F,F,F,
F200.Duty,-37798.28473117316,F,F,F,T,1
... and so on
How it should look at the end:
    | F100                | F200        |
----|---------------------|-------------|
T   | 557.9567856878748   | 100         |
Tv  | 557.9567856878748   | 5.550847203 |
T1  | -106382.60618934222 | 3.798721561 |
... and so on.
What I have tried:
import itertools

import pandas as pd


def read_lines(file_object) -> list:
    return [
        parse_line(line) for line in file_object.readlines() if line.strip()
    ]


def parse_line(line: str) -> list:
    return [
        i.split(",")[1]
        for i in line.strip().split()
        if i.startswith(("F100", "F200"))
    ]


def flatten(parsed_lines: list) -> list:
    return list(itertools.chain.from_iterable(parsed_lines))


def cut_into_pieces(flattened_lines: list, piece_size: int = 2) -> list:
    return [
        flattened_lines[i:i + piece_size] for i
        in range(0, len(flattened_lines), piece_size)
    ]


with open("sim.txt") as data:
    df = pd.DataFrame(
        cut_into_pieces(flatten(read_lines(data))),
        columns=["F100", "F200"],
    )
    print(df)
    df.to_excel("table.xlsx", index=False)
But it looks like this:
F100 F200
1 557.9567856878748 100
2 557.9567856878748 5.550847203
3 -106382.60618934222 3.798721561
.. ... ...
As you can see, the rows are not named (T, Tv, T1 etc.). I'm hitting a wall and don't know how to continue from here.
Thanks in advance
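One possible direction, as a rough sketch rather than a definitive fix: keep the part of the name after the dot (T, Tv, Duty, ...) while parsing, and use it as the row label. This assumes every relevant line has the form `<unit>.<property>,<value>,...` as in the sample above. The helper names `parse_simulation` and `to_frame` are made up for illustration, and the sample text is a shortened copy of the lines in the question.

```python
import io


def parse_simulation(lines, units=("F100", "F200")):
    """Collect {unit: {property: value}} from lines like 'F100.T,557.9,F,F,F,F,'."""
    table = {unit: {} for unit in units}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # "F100.T,557.9,F,..." -> name "F100.T", rest "557.9,F,..."
        name, _, rest = line.partition(",")
        unit, dot, prop = name.partition(".")
        if not dot or unit not in table:
            continue  # skip separators like "..." and units we did not ask for
        try:
            table[unit][prop] = float(rest.split(",")[0])
        except ValueError:
            continue  # skip lines whose second field is not a number
    return table


def to_frame(table):
    """Turn the nested dict into a DataFrame: units as columns, properties as index."""
    import pandas as pd  # imported here so the parsing step runs without pandas
    return pd.DataFrame(table)


# Shortened stand-in for the real file; with the real data this would be
# something like: with open("sim.txt") as data: table = parse_simulation(data)
sample = io.StringIO(
    "F100.T,557.9567856878748,F,F,F,F,\n"
    "F100.Duty,-106382.60618934222,F,F,F,T,1\n"
    "...\n"
    "F200.T,557.9567856878748,F,F,F,F,\n"
    "F200.Duty,-37798.28473117316,F,F,F,T,1\n"
)
table = parse_simulation(sample)
```

With pandas installed, `to_frame(table).to_excel("table.xlsx")` should then write the sheet. Leaving out `index=False` matters here, because the row names live in the DataFrame index and that argument drops them from the output.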