Hi, can anyone give me code for comparing more than 500 text files (pdb) against another pdb file and filtering out the mismatched files in Python? I am new to programming. I ran some software that produced a set of pdb files as output. My task is to match them against one pdb file and filter out whichever do not match. The software is written in Python. The outputs are structures of molecules stored as pdb files. I need to match hundreds of these structures against a single structure and filter out the pdb files that do not match.

Sorry, I previously misunderstood pdb files and screwed up my question.
My pdb files contain all the atom values and so on; each one is like a text file.
Updated 18-Jan-10 10:22am

You edited your post to add 'I am new to programming' and ignored the questions I asked you. I'm fairly sure this means you have no idea how to quantify the task you're asking about, which only increases the odds that what you want to do is not possible with Python and is, in general, beyond what you're likely to be capable of. I suggest you reconsider the task at hand, or perhaps define it properly for us, because if you can do what you want at all (and I doubt you can), I am sure it won't be with Python.
I didn't get any reply from you previously. I have only just got one reply from you. Anyway, I am familiar with C and C++. I am simply stuck. I do not know where to start or how. I ran some software that produced some pdb files as output. My task is to match them against one pdb (the structure of a molecule) and filter out whichever do not match.

Yes, you did, and if you'd registered with a real email address, you'd have got an email. The answers section is for answers, not questions, not yours or mine. You should edit your post to add detail. I posted in the forum below to ask you what you're trying to do.

You say you're new to programming, and now you say you know C and C++. How is that possible?

PDB is not an image format. It's a database format. If you're searching 500 databases to compare against a main DB, why would you use Python to do that? I have written code that links a Win32 database to Python, so it's possible to write C++ code and have Python execute it. The question is, why do it that way?
I am familiar with C and C++, that's about it, and I am not good at them either :-). Why is it not possible to write this in Python? The software that I used to create those image files as output is written completely in Python. That's why I was wondering why it is not possible to write a compare and filter in Python.

PLEASE edit your post, instead of posting fake answers. A pdb file, as far as I can tell, is a Microsoft-specific database format. Python is a scripting language. It seems to me like an odd choice, at least. What is the PDB file exactly? Is it a database? If you created them in Python, then surely you can examine them in Python? I'd have thought you'd need to interface to C++ to use a Microsoft-specific DB format.
Christian, the PDB extension can be used for various file formats, including some image formats. Although, the term "image" can mean different things too, such as a visual image or a snapshot of a database or file system.

To the questioner, please clarify what you mean exactly by "PDB image files". Also, would you like to compare the file names, file data, or the internal contents of the database snapshot (if you are in fact talking about a database "image")?

If you just want to compare file names, that's easy. Just get the name of the file you are comparing against and get the names of all the files in another directory and use a for loop to compare those filenames to the primary file name.
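As a rough sketch of that loop, assuming the files sit in a single directory (the function name `files_with_different_name` is made up for illustration):

```python
from pathlib import Path


def files_with_different_name(reference, directory):
    """Return the names of files in `directory` that differ from the reference file's name."""
    ref_name = Path(reference).name  # only the file name, not the full path
    mismatched = []
    for entry in sorted(Path(directory).iterdir()):
        # Skip sub-directories; compare plain file names only.
        if entry.is_file() and entry.name != ref_name:
            mismatched.append(entry.name)
    return mismatched
```

Calling it with a hypothetical `reference.pdb` and a hypothetical `outputs` directory would return every file name in `outputs` other than `reference.pdb`.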

The process would be similar if you are comparing file data. Instead of comparing file names, you just load up the file data and compare that instead.
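For the file-data case, a minimal sketch using the standard library's `filecmp` module might look like the following. It assumes "match" means the contents are byte-for-byte identical, which may be stricter than what the questioner needs; the function name `find_mismatched` and the `*.pdb` pattern are assumptions for illustration.

```python
import filecmp
from pathlib import Path


def find_mismatched(reference, directory, pattern="*.pdb"):
    """Return the files in `directory` whose contents differ from `reference`."""
    ref = Path(reference).resolve()
    mismatched = []
    for candidate in sorted(Path(directory).glob(pattern)):
        if candidate.resolve() == ref:
            continue  # do not compare the reference file with itself
        # shallow=False forces a comparison of the actual file contents.
        if not filecmp.cmp(ref, candidate, shallow=False):
            mismatched.append(candidate)
    return mismatched
```

The returned list could then be used to move, copy, or delete the mismatched files. If two structures should count as matching despite differences in whitespace or header lines, the comparison step would need to parse the files instead of comparing raw bytes.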

If you would like to compare data inside databases, then you'll want to use some SQL to query the PDB files and extract the data from them. Once you extract the data, you can compare it just as you would the file names or file data.
I thought so, that's why I posted my question below, which I keep pointing the OP to, so he can provide some sort of useful information. 'structure of a molecule' could mean a representation in a DB, or an actual image. As he keeps using the word 'image', I assume he has pictures and I assume there's no way that python will be able to compare these, and if they are not 100% identical, that there's no way the OP is going to be able to attempt this task.
I doubt that this is a copy-and-paste solution, but it's one of many Google hits I got on how to compare two text files using Python.
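For reference, one common way to compare two text files in Python is the standard library's `difflib` module. This is only a sketch and not necessarily what the linked page does; the function name `diff_text_files` is made up for illustration.

```python
import difflib


def diff_text_files(path_a, path_b):
    """Return unified-diff lines for two text files; an empty list means they match."""
    with open(path_a) as file_a, open(path_b) as file_b:
        lines_a = file_a.readlines()
        lines_b = file_b.readlines()
    # unified_diff yields nothing at all when the two line lists are equal.
    return list(
        difflib.unified_diff(
            lines_a, lines_b, fromfile=str(path_a), tofile=str(path_b)
        )
    )
```

An empty result means the two files are line-for-line identical; otherwise the returned lines show what was removed (`-`) and added (`+`).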

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
