| By Basil Voronkov at 2008-10-10 |
| Recent versions of the .NET Framework have a pretty functional API that lets you work with the HTTP and FTP protocols through a convenient object model instead of using sockets directly (which was the only way in the very first .NET Framework version, as far as I remember). And yes, this API is currently used by WideStream.
I don't want to discuss its limitations or functionality here - that's a separate topic, I believe. What I want to describe is the API architecture.
Basically, this network API uses the Abstract Factory pattern, which provides a high level of polymorphism. There is no need to write nearly duplicate chunks of code when everything can be generalized and therefore easily maintained and supported. As a result, obviously different protocols can be handled in a very similar manner - theoretically, you can use the same code to work with both HTTP and FTP.
The Factory, represented by the WebRequest class, defines a static Create method that accepts a URI and, depending on the protocol scheme used in that URI, returns an instance of either HttpWebRequest or FtpWebRequest, both typed as their superclass WebRequest. WebRequest in turn has a whole list of virtual members that apply to both FTP and HTTP - you can obtain a response synchronously or asynchronously, set credentials, configure authentication and a timeout, choose a proxy and so forth.
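To make the idea concrete, here is a minimal sketch of a scheme-based factory. It is written in Python purely as an illustration and is not the .NET API: the names `Request`, `HttpRequest`, `FtpRequest` and `create_request` are hypothetical stand-ins for the roles that WebRequest, HttpWebRequest, FtpWebRequest and Create play above.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class Request(ABC):
    """Hypothetical base class holding the settings shared by all protocols."""

    def __init__(self, uri):
        self.uri = uri
        self.timeout = 30.0       # seconds; a setting every protocol understands
        self.credentials = None   # likewise shared by every protocol

    @abstractmethod
    def describe(self):
        """Return a short human-readable description of the request."""


class HttpRequest(Request):
    def describe(self):
        return "HTTP request for " + self.uri


class FtpRequest(Request):
    def describe(self):
        return "FTP request for " + self.uri


# Which concrete class serves which URI scheme (an assumption of this sketch).
_REQUEST_TYPES = {"http": HttpRequest, "https": HttpRequest, "ftp": FtpRequest}


def create_request(uri):
    """Pick the concrete request class from the URI scheme."""
    scheme = urlparse(uri).scheme.lower()
    if scheme not in _REQUEST_TYPES:
        raise ValueError("unsupported scheme: %r" % scheme)
    return _REQUEST_TYPES[scheme](uri)


# The caller only ever sees the base type.
request = create_request("ftp://example.com/pub/file.zip")
request.timeout = 60.0
```

The caller never names a concrete class, which is exactly what makes this style attractive for small protocol-independent routines.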
That is cool. No doubt about it. And really, if your task is to write a small routine that simply grabs something from the "outside" in a protocol-independent manner, that is exactly what the doctor ordered. However, when it comes to really using all this stuff, the situation becomes less obvious. To be perfectly honest, I doubt this is a revelation to anybody who has ever dealt with HTTP and FTP, since apart from the general workflow these protocols don't have much in common.
The natural intention, or better to say instinct, of a developer who discovers such an API is to follow its architecture and implement a generic, polymorphic DownloadProcess class. But since there are a lot of actions and settings specific to either FTP or HTTP, you will eventually find that most of this class consists of code that is not very generic. In other words, your program has to ask all the time: Hey, what is this WebRequest thing? Is it actually an FTP or an HTTP request? As a result your code is full of C-style casts and calls to methods specific to one of these protocols. Polymorphism, eh?
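A small sketch of how this tends to look in practice. Again this is illustrative Python, not WideStream code; the classes and the attributes `headers`, `passive`, `binary` and `restart_offset` are hypothetical stand-ins for protocol-specific settings.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    uri: str
    timeout: float = 30.0


@dataclass
class HttpRequest(Request):
    headers: dict = field(default_factory=dict)


@dataclass
class FtpRequest(Request):
    passive: bool = False
    binary: bool = False
    restart_offset: int = 0


def configure(request, resume_from=0):
    """Prepare a request for download, optionally resuming at a byte offset."""
    request.timeout = 60.0  # the only line that is truly protocol independent

    # From here on the "generic" routine has to ask what it was really given.
    if isinstance(request, HttpRequest):
        request.headers["User-Agent"] = "ExampleDownloader/1.0"
        if resume_from:
            request.headers["Range"] = "bytes=%d-" % resume_from
    elif isinstance(request, FtpRequest):
        request.passive = True
        request.binary = True
        request.restart_offset = resume_from
    else:
        raise TypeError("unknown request type: %s" % type(request).__name__)
    return request
```

Only one line of `configure` works on the base type; everything else is a type check followed by protocol-specific code.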
Of course it's not the API that you should blame, it's the developer who didn't foresee these pretty obvious difficulties. But then again, if this .NET Framework API is meant for more than simplistic network actions, could it have been designed differently?
That's a hell of a question, and it doesn't have a direct answer. Of course the .NET Framework API is a generic API, and this WebRequest class can be used to browse an FTP directory as well as to download a file over a secure HTTP connection. And if it were implemented in a less pattern-oriented manner, somebody would definitely ask: why don't you use Abstract Factory here?
So what's the conclusion? Is there any other pattern that fits better here?
This post is not about patterns. The quality of a software architecture is defined not by the number of patterns you use but by the quality of the initial abstractions chosen for it. What were the .NET Framework designers trying to achieve with this WebRequest class hierarchy? It doesn't look like a low-level network API - for that goal, sockets and probably FtpClient-style helper classes would be more than enough. A high-level object-oriented architecture? We get a generalized WebRequest and, say, an FtpWebRequest that may seem to be a concrete implementation of WebRequest (which is of course true) but at the same time is itself an abstraction of everything that can be done over the FTP protocol. Using a concrete protocol as an abstraction is not the best choice, I believe.
How else could it be done?
I'd say that we have a number of workflows that describe typical tasks and processes. Take, for example, an Explorer-style workflow. We can implement an Explorer abstraction, and it doesn't matter which exact implementation sits behind it - it may be the file system, FTP, SFTP, WebDAV or whatever. No doubt there are a lot of differences between the concrete implementations of a file system explorer and an FTP explorer, but once you have the right abstraction, the right building block, the rest becomes just a matter of combining ready-to-use high-level blocks, like in a Lego set. Remember that you will also have an Authorization abstraction, a Listing abstraction and so forth. Imagine how clean the code will be.
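A rough sketch of what such a workflow-level abstraction might look like, in Python and under heavy assumptions: `Explorer`, `LocalExplorer`, `InMemoryExplorer` and `names_with_suffix` are hypothetical names, and the in-memory class merely stands in for a remote implementation such as FTP or WebDAV.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class Explorer(ABC):
    """The workflow-level abstraction: something whose folders can be listed."""

    @abstractmethod
    def list_names(self, folder):
        """Return the sorted names of the entries directly inside folder."""


class LocalExplorer(Explorer):
    """Explorer backed by the local file system."""

    def list_names(self, folder):
        return sorted(entry.name for entry in Path(folder).iterdir())


class InMemoryExplorer(Explorer):
    """Stand-in for a remote explorer; folders are kept in a plain dict."""

    def __init__(self, tree):
        self._tree = tree  # maps a folder path to the list of names inside it

    def list_names(self, folder):
        return sorted(self._tree[folder])


def names_with_suffix(explorer, folder, suffix):
    """Client code: works with any Explorer and never asks which one it got."""
    return [name for name in explorer.list_names(folder) if name.endswith(suffix)]


remote = InMemoryExplorer({"/pub": ["b.zip", "a.zip", "readme.txt"]})
archives = names_with_suffix(remote, "/pub", ".zip")
```

The point of the sketch is that `names_with_suffix` contains no type checks at all, because the abstraction describes the task rather than the protocol.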
But in reality I am not being entirely fair. I am not trying to design a new framework or anything like that - the new Mekong milestone is getting closer and closer, and now I realize that a lot of the solutions chosen for Murray... may require some serious attention in Mekong. The WebRequest API (including the whole download process architecture) is definitely one of them. |
| By Basil Voronkov at 2008-10-04 |
| While working on the latest release of WideStream – now Murray Alpha 4 – I found a strange problem with FTP in the program. I was testing the FTP explorer and the FTP download logic by navigating to an FTP folder in the FTP explorer, selecting a file and sending it to the download queue with a double-click. Everything worked OK with two FTP servers. But not with the third one.
OK, so I don’t have any problems browsing the folder structure of this third server. The server obviously requires authorization, and I am absolutely sure that my account has full read/write access to everything on it. I try to download a file that I have downloaded many times before, and instead of the file WideStream returns a meaningless 550 FTP error (which means that the file was deleted, or you don’t have access to it, or it never existed, or whatever). OK, that may be a bug in the program. Since I should have access to this file, I just want to double-check that it is really a problem with WideStream. I copy the link into Internet Explorer. And what do you think happens? It fails.
This is becoming more and more intriguing. I recall that I still have ReGet installed on my box. So that will be the last test. If ReGet can’t do it, then it is absolutely impossible, in spite of the fact that I should have this damn access. Supernatural powers, doom, whatever: this file is simply not downloadable today. So I copy the link into ReGet. And after generating a long list of log entries, ReGet finally starts to download it.
What actually happened? I had a bug in the FTP explorer – but only in the Windows-style listing parser, the UNIX listing parser was OK – that added a space at the very beginning of each file name. It wasn’t easy to notice in the FTP explorer, so I didn’t. As a result, when I double-clicked a file to add it to the download queue, the generated link was invalid, and WideStream and MSIE obviously failed to download the file. And why didn’t ReGet fail? According to its download log, ReGet first tries to download the file "as is" and receives the same 550 error, but that doesn’t stop it. It lists the FTP directory where the file is located and, after doing some sort of analysis, figures out that the file name is not quite right and removes the space from the beginning. After that everything works as expected.
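The recovery step could be sketched roughly like this. It is only my guess at the idea, written in Python; it is not ReGet's actual algorithm, and the function name `resolve_file_name` is hypothetical. After a failed attempt, the requested name is compared against the directory listing while ignoring leading and trailing whitespace.

```python
def resolve_file_name(requested, listing):
    """Return the name from listing that most plausibly matches requested.

    An exact match wins. Otherwise names are compared with surrounding
    whitespace ignored, and the result is used only if it is unambiguous.
    Returns None when no safe match exists.
    """
    names = list(listing)
    if requested in names:
        return requested

    wanted = requested.strip()
    candidates = [name for name in names if name.strip() == wanted]
    if len(candidates) == 1:
        return candidates[0]
    return None  # nothing found, or several names differ only by whitespace


# The broken link carries a leading space; the server knows the real name.
fixed = resolve_file_name(" setup.exe", ["readme.txt", "setup.exe"])
```

A download manager would call something like this only after the first attempt fails, and then retry with the returned name if there is one.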
So I believe that is something that I can add to my TODO list. It is not that difficult to implement and in certain situations it will look like your download manager can do real magic. |
| By Basil Voronkov at 2008-10-02 |
I'd like to announce changes to the roadmap. The upcoming release that was planned to be the first WideStream Beta now becomes Alpha 4. This doesn't mean that the release is not stable enough to be called a Beta, but it still lacks a lot of important functionality that I consider mandatory for a project entering the Beta stage.
Beta 1 appears to be a really unlucky release. I was initially planning to release the first Beta right after Alpha 2, but in the end we will likely have at least four alphas.
I estimated how much time it would take to finish all the functionality that should be in place for Beta 1 and decided that it is better to make an early release than to wait a month or two until I am really ready to call the project a Beta. This functionality includes such features as:
- A new way of saving the download queue definition - right now it is stored in the program configuration file, which is really, really wrong. I am planning to store the download queue definition as a separate XML file and also to let the user work with several saved queues - to save and load queue definitions at run time.
- A configurable autosave feature for the download queue definition. Right now, if the program crashes you lose all information about the current downloads.
- Support for pausing and resuming FTP downloads. Currently you can't pause an FTP download - you can only stop it and start again from the beginning, which is not good.
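As an illustration of the first two items, here is a minimal sketch of keeping a queue definition in its own XML file. It is Python, not the WideStream implementation, and everything in it is an assumption: the element names `queue` and `download`, the attributes `url` and `target`, and the function names. Writing to a temporary file and then renaming it is one simple way to keep an autosave from leaving a half-written file behind if the program dies in the middle.

```python
import os
import xml.etree.ElementTree as ET


def save_queue(path, items):
    """Write the queue definition to path as a standalone XML file."""
    root = ET.Element("queue")
    for item in items:
        ET.SubElement(root, "download", url=item["url"], target=item["target"])

    # Write next to the final file first, then swap it in with one rename,
    # so a crash during the write does not damage the previous save.
    temp_path = path + ".tmp"
    ET.ElementTree(root).write(temp_path, encoding="utf-8", xml_declaration=True)
    os.replace(temp_path, path)


def load_queue(path):
    """Read a queue definition previously written by save_queue."""
    root = ET.parse(path).getroot()
    return [
        {"url": element.get("url"), "target": element.get("target")}
        for element in root.findall("download")
    ]
```

Because each queue lives in its own file, supporting several saved queues comes down to calling these two functions with different paths.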
Of course, probably nobody will use an Alpha version of a download manager, and I could name this release Beta just because of that, but I believe that would be unfair.
Still, keep in mind that Alpha 4 is a much more stable release than Alpha 3. A lot of bugs in HTTP and FTP downloads were fixed, and at the moment there are no known major issues.
And yes - I hope there will be a Beta soon. Beta 1 will include most of the features planned for the Murray milestone. |