Protein structures in PDB format can be downloaded from The Protein Data Bank.
Here is an example of the PDB format (the line of numbers at the top only marks the column positions; it is not part of the PDB file):
12345678901234567890123456789012345678901234567890123456789012345678901234567890
ATOM      1  N   ASP E   1      36.400  74.848  33.180  1.00 50.94           N
ATOM      2  CA  ASP E   1      36.906  75.294  34.515  1.00 50.47           C
ATOM      3  C   ASP E   1      35.879  76.187  35.223  1.00 49.81           C
ATOM      4  O   ASP E   1      34.970  75.684  35.893  1.00 50.55           O
ATOM      5  CB  ASP E   1      38.248  76.022  34.359  1.00 50.93           C
ATOM      6  N   GLN E   2      36.015  77.505  35.050  1.00 48.24           N
ATOM      7  CA  GLN E   2      35.110  78.493  35.659  1.00 46.03           C
ATOM      8  C   GLN E   2      33.746  78.491  34.960  1.00 44.06           C
ATOM      9  O   GLN E   2      32.850  79.274  35.320  1.00 44.42           O
ATOM     10  CB  GLN E   2      35.700  79.914  35.555  1.00 47.00           C
ATOM     11  CG  GLN E   2      37.155  80.078  36.028  1.00 47.43           C
ATOM     12  CD  GLN E   2      38.178  79.456  35.064  1.00 48.02           C
ATOM     13  OE1 GLN E   2      37.851  79.116  33.909  1.00 47.98           O
ATOM     14  NE2 GLN E   2      39.424  79.298  35.538  1.00 48.25           N
ATOM     15  N   PRO E   3      33.602  77.600  33.973  1.00 41.02           N
ATOM     16  CA  PRO E   3      32.391  77.479  33.163  1.00 37.84           C
ATOM     17  C   PRO E   3      31.095  77.246  33.944  1.00 34.34           C
ATOM     18  O   PRO E   3      31.037  76.403  34.849  1.00 34.24           O
ATOM     19  CB  PRO E   3      32.580  76.398  32.082  1.00 39.10           C
ATOM     20  CG  PRO E   3      31.488  76.381  31.000  1.00 40.44           C
The fields in each line of the PDB file are, in order: record type, atom number, atom name, residue name, chain ID, residue number, (x, y, z) coordinates, occupancy, temperature factor, and atom type (the element symbol).
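Because PDB is a fixed-column format, the fields are most reliably read by column position rather than by splitting on whitespace. The following is a minimal sketch, assuming the standard PDB column layout for ATOM records; the names `AtomRecord` and `parse_atom_line` are hypothetical and are not part of meta-PPISP.

```python
from typing import NamedTuple


class AtomRecord(NamedTuple):
    record_type: str
    atom_number: int
    atom_name: str
    residue_name: str
    chain_id: str
    residue_number: int
    x: float
    y: float
    z: float
    occupancy: float
    temperature_factor: float
    element: str


def parse_atom_line(line: str) -> AtomRecord:
    """Slice one full-width ATOM record by its fixed columns.

    Comments give the 1-based column numbers, matching the ruler line
    shown above the example; Python slices are 0-based.
    """
    return AtomRecord(
        record_type=line[0:6].strip(),          # columns 1-6
        atom_number=int(line[6:11]),            # columns 7-11
        atom_name=line[12:16].strip(),          # columns 13-16
        residue_name=line[17:20].strip(),       # columns 18-20
        chain_id=line[21],                      # column 22
        residue_number=int(line[22:26]),        # columns 23-26
        x=float(line[30:38]),                   # columns 31-38
        y=float(line[38:46]),                   # columns 39-46
        z=float(line[46:54]),                   # columns 47-54
        occupancy=float(line[54:60]),           # columns 55-60
        temperature_factor=float(line[60:66]),  # columns 61-66
        element=line[76:78].strip(),            # columns 77-78
    )


example = "ATOM      2  CA  ASP E   1      36.906  75.294  34.515  1.00 50.47           C"
record = parse_atom_line(example)
print(record.atom_name, record.chain_id, record.residue_number, record.x)
```

Slicing by column keeps the parse correct even when adjacent fields touch (for example, a four-digit residue number directly after the chain ID), which whitespace splitting would misread.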
Note that the chain ID is in column 22 ("E" in the example). In some PDB files, column 22 is blank; in these cases, enter an underscore "_" as the chain input.
meta-PPISP uses only lines that begin with 'ATOM'. In addition, the fields after the (x, y, z) coordinates are irrelevant for prediction purposes.
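To find out which chain input to supply for a given file, one can scan its ATOM lines and read column 22. The sketch below is an illustration only, not meta-PPISP's own code: the function name `chain_ids` and the sample lines are hypothetical. It skips every line that does not begin with 'ATOM' and reports a blank chain column as "_".

```python
def chain_ids(pdb_lines):
    """Return the distinct chain IDs of ATOM records, in order of first appearance.

    A blank in column 22 is reported as "_", the chain input described above.
    """
    seen = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # only lines that begin with 'ATOM' are considered
        chain = line[21] if len(line) > 21 else " "  # column 22 (0-based index 21)
        if chain == " ":
            chain = "_"
        if chain not in seen:
            seen.append(chain)
    return seen


# Hypothetical sample: one header line, one ATOM line with chain "E",
# one ATOM line with a blank chain column, and one HETATM line.
sample = [
    "HEADER    EXAMPLE",
    "ATOM      1  N   ASP E   1      36.400  74.848  33.180  1.00 50.94           N",
    "ATOM      2  CA  ASP     1      36.906  75.294  34.515",
    "HETATM    3  O   HOH E 101      10.000  10.000  10.000",
]
print(chain_ids(sample))
```

The third sample line stops after the z coordinate on purpose: since the fields after the coordinates are not needed, the scan does not depend on them being present.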