You will write a program that can be used to retrieve the contents of a file stored on an FTP server without using any existing FTP client program. (In particular, you should not use Python's ftplib.) You are welcome to write your program in any language. Most of you will find Python easiest, because our textbook has information about writing TCP clients in Python. Another likely contender would be Java, for which examples are available from previous editions of our textbook.
For full information on the application-layer protocol you will be using, see RFC 959.
Retrieve the file /pub/cs126/nbody/3body.txt using telnet to ftp.cs.princeton.edu's port 21. That is, type the relevant FTP protocol commands (USER, PASS, PASV, RETR, QUIT) by hand and, when the response to PASV comes back, open a second telnet connection (in another window) to the specified data port. For the USER command, you should specify anonymous, and for PASS, you should use anonymous@gustavus.edu.
You should also repeat the process but this time give the PASS command a password that starts with -. Note the simplification in responses that results. This simplification is a non-standard feature of the particular ftp server used at Princeton, so it isn't really good to rely upon, but it may be helpful for getting your program to work sooner.
Finally, it would be good to try out the NLST command within the protocol; I particularly suggest
NLST /pub/cs126/nbody/*.txt
This command should be preceded by a PASV command and the opening of a separate telnet to the data port, just as with RETR. You can try out both NLST and RETR in a single control connection to the ftp server, but each will need to use a separate data connection, so each will need to be preceded by its own PASV command.
Write a program that opens up a socket connected to ftp.cs.princeton.edu's port 21 and engages in the appropriate communication so as to retrieve /pub/cs126/nbody/3body.txt, printing the resulting lines of text on the standard output. This will also entail opening up a second socket, connected to the data port that the server specifies, for the data connection.
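To find the data port, your program has to pick apart the response to PASV. RFC 959 gives the form of that response as 227 Entering Passive Mode (h1,h2,h3,h4,p1,p2), where the first four numbers are the bytes of the IP address and the last two are the high and low bytes of the port number. Here is a minimal sketch of one way to extract them; the function name parse_pasv and the sample address in the comment are my own inventions for illustration, not anything the server is guaranteed to send.

```python
import re


def parse_pasv(response):
    """Return (host, port) taken from the response line to a PASV command."""
    # Look for six comma-separated numbers anywhere in the line, rather than
    # relying on the exact wording or punctuation around them.
    match = re.search(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", response)
    if match is None:
        raise ValueError("no address found in PASV response: " + response)
    numbers = [int(group) for group in match.groups()]
    host = ".".join(str(n) for n in numbers[:4])
    # The port arrives as two bytes, high-order byte first.
    port = numbers[4] * 256 + numbers[5]
    return host, port


# Made-up example: "227 Entering Passive Mode (192,0,2,10,195,80)."
# would give the host "192.0.2.10" and the port 195 * 256 + 80.
```

The resulting pair can be passed directly to socket.create_connection to open the data connection.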
Initially, to get something working quickly, you should use a password that begins with - so that you don't get any multi-line responses. Although you'll need to read in each response before proceeding to the next command, you can assume that the responses from the server all indicate success. Also, at this point in your program's development, you can assume that the ftp server does not expect an ACCT command after USER and PASS. (This assumption is correct for Princeton's server and most others in current use.) Your program at this point will only earn you partial credit because it isn't truly conforming to the FTP specification, but it should print the correct lines of output if all goes well.
Each line of text sent for FTP should end with the carriage return and linefeed characters, which you can include in a Python or Java string as "\r\n".
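As a small illustration of the line ending, here is a sketch of a helper that builds one command line ready for sending. The name format_command is hypothetical. I am assuming Python 3, where a socket's sendall method wants bytes rather than a string, which is why the result is encoded.

```python
def format_command(verb, argument=None):
    """Build one FTP command line, terminated by carriage return and linefeed."""
    line = verb if argument is None else verb + " " + argument
    # Python 3 sockets send bytes, so encode the finished line.
    return (line + "\r\n").encode("ascii")
```

With a connected socket named control_socket (again, a name chosen only for this example), you could then write control_socket.sendall(format_command("USER", "anonymous")).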
Next, to get closer to full credit, you should upgrade your program to start paying attention to the response codes. However, to keep your life still reasonably simple, you should continue using a password that starts with - so that you don't get any multi-line responses.
To properly respond, your program only needs to look at the first digit of the response code, as shown in the state diagrams in Section 6 of RFC 959.
If the first digit indicates successful completion of the command, continue with the next step. As a variant on this, the digit may indicate what next step is needed; in particular, after the PASS command, the first digit of the response code will inform you whether to issue an ACCT command. Your program should obey that information. If the ACCT command is called for, you can give anonymous as the account name, just as you did for the user name.
On the other hand, the first digit of a response code may indicate that something has gone wrong. There are two kinds of things that can have gone wrong: operation failures (such as trying to retrieve a file that doesn't exist) and protocol errors (finding out that the client and server aren't talking to each other properly in accordance with FTP). The state diagrams show which first digits (in the range 1–5) indicate each of these outcomes. Also, if a response line doesn't begin with any digit in the range 1–5, that is in itself a protocol error. Therefore, it suffices to look for indications of success or operation failures, treating everything else as a protocol error.
When either a failure or a protocol error occurs, your program should raise an exception rather than continuing. Best practice would be to define two subclasses of exception, one for failures and one for protocol errors. However, so far as the grading criterion goes, I will accept your program raising a generic exception in both situations, so long as the textual message of the exception includes an indication of either "FTP failure" or "FTP error" together with the actual response line received. In Python, you could do this with a line of code such as
raise RuntimeError("FTP failure: " + response)
The equivalent in Java would be
throw new RuntimeException("FTP failure: " + response);
If you want to be more professional and define two separate subclasses for FTP failures and FTP errors, then the string can just be the server's response line.
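For those taking the more professional route, here is a sketch of what the two subclasses and a checking function might look like in Python. The names FTPFailure, FTPProtocolError, and check_response are all hypothetical. The treatment of first digits 4 and 5 as failures reflects my reading of the state diagrams; which digits count as acceptable depends on the command, so the caller passes them in.

```python
class FTPFailure(Exception):
    """The server reported that the requested operation failed."""


class FTPProtocolError(Exception):
    """The response did not fit what the protocol allows at this point."""


def check_response(response, acceptable="2"):
    """Return the first character of response if it is one of the acceptable
    digits; otherwise raise the appropriate exception."""
    first = response[:1]
    # Guard against an empty response, since "" counts as being "in" any string.
    if first and first in acceptable:
        return first
    if first in ("4", "5"):
        raise FTPFailure(response)
    # Anything else, including a line that doesn't start with a digit at all.
    raise FTPProtocolError(response)
```

For example, after PASS you might call check_response(response, acceptable="23") and issue ACCT only if the result is "3". After RETR you would expect a response starting with 1 first, followed later by one starting with 2.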
To get the final point of credit, your program needs to be able to deal with multi-line responses. To test this (at least partially), you should remove the - from the start of the password.
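According to RFC 959, a multi-line response starts with a line in which the three-digit code is followed immediately by a hyphen, and ends with the first later line that begins with the same code followed by a space. Here is a minimal sketch of reading one complete response under that rule. The name read_response is hypothetical, and the welcome text in the demonstration is made up rather than copied from Princeton's server.

```python
import io


def read_response(control_file):
    """Read one complete response and return its lines as a list of strings."""
    first = control_file.readline()
    if first == "":
        raise EOFError("connection closed before a response arrived")
    lines = [first]
    if first[3:4] == "-":
        # Multi-line response: keep reading until a line starts with the
        # same code followed by a space.
        terminator = first[:3] + " "
        while not lines[-1].startswith(terminator):
            line = control_file.readline()
            if line == "":
                raise EOFError("connection closed in the middle of a response")
            lines.append(line)
    return lines


# Demonstration on canned text standing in for the control connection.
demo = io.StringIO("230-Welcome.\r\n  Please behave.\r\n230 Logged in.\r\n")
print(read_response(demo))
```

With a real connection, a file-like object of the kind this function expects can be obtained from the socket's makefile method. The last line of the returned list is the one whose first digit you should examine.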
For extra credit, change your program so that within a single FTP control connection, it first uses the NLST command to list all the files of the form /pub/cs126/nbody/*.txt, then uses the RETR command to retrieve one of them selected at random. (Here's a Python programming reminder: to randomly choose an element of a list, you can use the random.choice function, assuming you import the random module.)
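To make the reminder concrete, here is a sketch of turning the text received on the NLST data connection into a list and picking one entry. The name choose_file is hypothetical, and I am assuming the listing arrives as one file name per line. Whether each name includes the directory depends on the server, so check what you actually receive before building the RETR command.

```python
import random


def choose_file(listing):
    """Pick one name at random from the text of an NLST listing."""
    # splitlines copes with "\r\n" line endings; blank lines are skipped.
    names = [line.strip() for line in listing.splitlines() if line.strip()]
    if not names:
        raise ValueError("the listing contained no file names")
    return random.choice(names)
```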
Upload your program's source code using Moodle.