In my testkernel program I'd like to walk directory trees via a variety of protocols. What I think I want is something like os.walk, but which also works for FTP and for typical HTTP directory listings (like http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.2-precise/). This is in the spirit of openanything.py.
For FTP walking I found several options, including ftptool and the ftputil module, which has the advantage of being packaged in Ubuntu. I've already implemented my own very simple recursive walk of HTTP directory listings, using Beautiful Soup. But before I combine them with os.walk, I wonder whether this has been done already.
I know the semantics of HTTP walking are not well-defined the way they are for file systems and FTP, so I'll have to assume that a directory is indicated by a link whose URL has a trailing slash and extends the URL of the directory being listed. I'll also have to be careful to avoid infinite walks. But even for a subset of os.walk (e.g. only topdown), this sort of thing seems useful.
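To make the idea concrete, here is a minimal sketch of a topdown-only HTTP walk under exactly those assumptions. The names http_walk and fetch are hypothetical, and it uses the standard library's html.parser rather than Beautiful Soup only to stay self-contained. A link counts as a subdirectory when it extends the current directory URL by one path segment and ends in a slash; anything else that extends it by one segment counts as a file.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def _fetch(url):
    # Default fetcher; real code would want timeouts and error handling.
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def http_walk(top, fetch=_fetch):
    """Yield (dirurl, dirnames, filenames), like os.walk with topdown=True.

    Assumption: a directory is a link that extends dirurl by exactly one
    path segment and ends with "/". Names are returned as they appear in
    the URL (still percent-encoded).
    """
    if not top.endswith("/"):
        top += "/"
    seen = set()
    stack = [top]
    while stack:
        dirurl = stack.pop()
        if dirurl in seen:
            continue
        seen.add(dirurl)
        parser = _LinkCollector()
        parser.feed(fetch(dirurl))
        dirnames, filenames = [], []
        for href in parser.hrefs:
            url = urldefrag(urljoin(dirurl, href)).url
            if not url.startswith(dirurl):
                continue  # parent directory or off-site link
            rest = url[len(dirurl):]
            if not rest or "?" in rest:
                continue  # self link or column-sorting link
            if rest.endswith("/") and "/" not in rest[:-1]:
                if rest[:-1] not in dirnames:
                    dirnames.append(rest[:-1])
            elif "/" not in rest and rest not in filenames:
                filenames.append(rest)
        yield dirurl, dirnames, filenames
        # As with os.walk, the caller may prune dirnames in place.
        for name in reversed(dirnames):
            stack.append(dirurl + name + "/")
```

Because only links that strictly extend the current URL are followed, a listing cannot send the walk back up the tree or in a cycle between pages. A server that generates arbitrarily deep paths could still make it run forever, so a depth limit would be a sensible addition. Passing fetch as a parameter also makes it possible to exercise the walk against canned pages without touching the network.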
Has this been done? Any advice?