
When a Customer wants to upgrade to a newer version of Cincom® ObjectStudio® or Cincom® VisualWorks®, I and other members of our Smalltalk Support team here at Cincom suggest that they review the Release Notes—not only for the version they are upgrading to, but also for all the releases that came after their current version. While this gives customers an overview of what has changed, the individual specific changes usually can only be seen by reviewing the actual code changes that happened between the versions.

In Chapter 2 of our Source Code Management Guide, we suggest publishing the VisualWorks base code into a company’s Store repository.¹ If that is done, the Package Comparison tool can be used during development to see if any changes or overrides have been made to the base code. As an added benefit, the Package Comparison tool can then be used to review the changes that have happened between versions of VisualWorks.

During many upgrade projects, there are phases when it is beneficial to review what base code has changed, been added, or been removed. Focusing on just one of those types of change at a time can be difficult in the Package Comparison tool because it combines all of the types into one presentation.

To help reduce the friction during that phase of an upgrade project, I’m introducing the Filtered Store-Code Comparison Contributed Package for Cincom ObjectStudio 8.4 or Cincom VisualWorks 7.9 and above.

Image of the Package Comparison Tool with the new options.

When loaded, this package adds three checkboxes to the Package Comparison Tool that allow the user to control which types of comparisons are shown or hidden:

Image of the new options at the bottom of the Package Comparison tool.

When any of these new options are turned on (checked), the comparison view will be updated so that those types of comparisons are no longer shown. The hidden comparisons can be shown again by turning off any of the options (unchecking a checked option).

The new options are made possible in part by wrapping the existing Tools.PackageComparisonTool within instances of PackageComparisonToolWithFiltering—a new ApplicationModel provided by this package. Doing this required overriding a couple of existing methods in the image, but I tried to keep the number of overrides to a minimum. The majority of the functionality was enabled by adding and keeping flags in the properties dictionary that already exists in every instance of VisualPart.² Also, since not every kind of AbstractComparisonRollupView kept track of what kind of comparison it represented, the code has been extended so that every comparison view can answer whether it represents an addition, a change, or a removal.
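For illustration, a filtering flag might be kept and consulted roughly like this. This is only a sketch: the #hideAdditions key is a hypothetical name, and the propertyAt:put: / propertyAt:ifAbsent: accessors are assumed from the standard properties protocol mentioned above.

	"Record the user's checkbox choice on the comparison view itself."
	view propertyAt: #hideAdditions put: true.

	"Later, consult the flag when deciding whether to display this comparison."
	shouldHide := view propertyAt: #hideAdditions ifAbsent: [ false ].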

If you wish to explore how the extensions and overrides work, additional technical details can be found in the comments for the package and in all of the added methods. I’ve done my best to explain my design decisions, along with noting any compromises I felt were needed.

The current version of the Filtered Store-Code Comparison package can be downloaded from the Contributed Components section of our website:

http://www.cincomsmalltalk.com/main/community/product-portal/contributed/


1. Publishing the base code is recommended in “Chapter 2 − Beginning to Use Store” of our Source Code Management Guide documentation. (Overall page 27 of 168 in the latest version of the document.)

2. The properties dictionary was first added as an instance variable of VisualPart in VisualWorks 7.6. For more information, see the Release Notes for 7.6 or the instance variable documentation in the class comments of VisualPart.

Posted by: James T. Savidge


Introducing BrowseOverrides

BrowseOverrides is a new contributed package designed to help users quickly review any methods in an image that have been overridden by extensions in other packages.

While I was working on the Senders of Deprecated browser, customers told me and other members of our Smalltalk Support team here at Cincom that they have difficulty managing the overridden methods in their images. Because the existing Override Editors¹ are constrained by the limitations of the Change List Tool², most of our customers work with overridden methods within the confines of the standard Refactoring Browser.

When a method is overridden in the code browsers, we only show the code for that method in the package that contains the most recently loaded or edited version of the override. Some of our customers agree that this is the right thing to do, but others want the overridden version of the methods to also be shown in the package(s) where the original methods resided. Our Engineering team is in the process of finding a way to give our customers something that will accommodate all of those needs, but it may take a while before that work is ready to be released.

To provide some short-term help in this area, I’m introducing the BrowseOverrides Contributed Package for Cincom® ObjectStudio® 8.1 or Cincom® VisualWorks® 7.6 and above. Included are extensions to the VisualLauncher and Override classes that are designed to help you quickly review any methods in your image that have been overridden by extensions in other packages.

Loading the package will add an Overridden Methods submenu and submenu items to the “Browse” menu in the Launcher window:

Image of the Overridden Methods browser menu.

Each of the submenu items will open a method list browser on a list of methods based on the following criteria:

  • All      -> Any methods that have an Override.
  • Single   -> Any methods that have only one (a single) Override.
  • Multiple -> Any methods that have more than one Override.

Image of the Overridden Methods browser.

Since a standard method list browser is used, the code editing pane will have an active red-colored “Overridden” tab for each of the methods in the list. This tab has the same functionality as the “Overridden” tab in the code editing panes of the standard Package/Class Refactoring browsers. Please note that the look and functionality of the “Overridden” tab will be somewhat different in versions of VisualWorks prior to 7.7.1.³

If you wish to explore how the extensions work, additional technical details can be found in the comments for the package, and in all of the added methods. I’ve done my best to explain my design decisions along with noting any compromises I felt I had to make along the way.

Starting on December 9, the current version of the BrowseOverrides package can be downloaded from the Contributed Components section of our website:

http://www.cincomsmalltalk.com/main/community/product-portal/contributed/


1. Override Editors are described in “Chapter 3 – Override Editor” in our Tool Guide documentation.

2. Change List Tools are described in “Chapter 5 – Change List” in our Tool Guide documentation.

3. Please see the “Override Code Tool gets a facelift” section of the Release Notes for VisualWorks 7.8, page 1-20, about the Override tab in the code browsers.

Posted by: James T. Savidge

Introducing the Senders of Deprecated Browser

The Senders of Deprecated browser is a new contributed package for Cincom® ObjectStudio® 8.6 or Cincom® VisualWorks® 8.0 and above.

Image of the Senders of Deprecated Browser.

I created it to help developers track down, evaluate, and hopefully eliminate any places in their code that call methods which have been marked as deprecated by the Engineering team at Cincom, or by anyone else.

After loading the package, the browser can be opened by picking the “Browse -> Senders of Deprecated” menu item from the VisualWorks Launcher window, or by executing the following code:

    Smalltalk.DeprecatedSendersBrowser openDeprecatedSenders.

As the browser opens, it searches the image to collect all of the methods that send >>deprecated:. It then attempts to find the methods that call any of those deprecated methods. The list of deprecated methods is displayed in the upper list pane, and when one of those methods is selected, the second list pane below it is filled with the methods that are believed to be senders of the deprecated method in the upper list. If a method in the second list pane is selected, then below it is a standard code pane where the source of that sending method is shown and can be edited.
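For reference, a deprecated method typically announces itself by sending #deprecated: on entry, along these lines (a hypothetical example; the browser simply looks for this kind of send):

	oldApi
		"Obsolete: kept only for backward compatibility."
		self deprecated: 'Use #newApi instead.'.
		^self newApi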

Because of the inherent performance compromises in MethodCollector and MethodFilterReference, there will be some false positives in the list of methods collected. Methods in the lower list pane can be removed from that list by using the standard “Method -> Remove From List” menu item. Deprecated methods in the upper list pane can be removed from that upper list by using the new “Deprecated -> Remove From List” menu item. (If there are any senders still left in the lower list pane for the selected deprecated method, those will be removed from the lower list pane at the same time.)

Image of the new Deprecated menu

The new Deprecated menu is much like the existing Method menu, but it has been pared down to the items that make sense for a method list that does not have a corresponding code pane where source can be viewed and edited.

If you wish to explore how the browser works, additional technical details can be found in the comments for the package, the classes, and in all the methods. I’ve done my best to explain my design decisions, along with noting any compromises I felt I had to make along the way. Hopefully, you can jump into any place that interests you, and the comments should help you understand how it is connected to other parts of the browser, and to the rest of the system.

The current version of the BrowseDeprecated package can be downloaded from the Contributed Components section of our website:

http://www.cincomsmalltalk.com/main/community/product-portal/contributed/

On a historical note, I started work on this browser when one of our Customers wrote about how difficult and time-consuming it was for them to verify that none of their code called any of our deprecated methods. I designed this browser to make it easier to focus on and complete that process for any ObjectStudio or VisualWorks project.

There are numerous improvements that can be made to the browser, but I would like to get some feedback from real users before deciding what to work on first. After working with the new browser, please tell me what features are useful to you, and what your team needs to make it even more valuable during your projects.

Posted by: James T. Savidge

Xtreams at Work

Lately I’ve been working on an Xtreams-based implementation of the SSL/TLS protocol, and I think some specific parts of it could make interesting examples: non-trivial, real-life solutions rather than artificial academic exercises. This is my first attempt to extract an interesting example, hopefully without getting bogged down in too much detail. Although I will try to explain the constructs used in the example, it requires some basic familiarity with Xtreams; the documentation at http://code.google.com/p/xtreams can be used as a reference.

As you may know, the SSL/TLS protocol is meant to protect other, higher-level protocols from eavesdropping and tampering. The data payload is simply a stream of bytes; its semantics are not relevant to SSL/TLS at all. The payload gets split into chunks called records, and each record is then individually encrypted and signed to provide the required protection.

There is other traffic beyond just the data on an established connection. There are handshake messages for establishing session keys, etc. A handshake normally happens at the beginning, but can also happen again later on a long-lasting connection, generally to refresh the keying material for improved security. There are also alerts, which allow one side to warn the other about certain conditions (e.g. that the connection is about to be closed). Overall there are four different types of payload that can be carried by a record. The rule is that a single record can carry only one type of payload. However, there are no rules about how the payload is partitioned between records. A handshake record can carry several handshake messages inside, or a single handshake message can span several handshake records. However, a single record cannot carry both handshake messages and data.

From the point of view of the data payload, the record boundaries are irrelevant and should be completely transparent. Consequently, the most straightforward way to present the data payload is as one continuous stream of bytes. Of course, the interleaved chunks of non-data payload need to be filtered out and handled accordingly.

Problem definition

To make the solution a bit less convoluted, let’s simplify the problem a bit. Let’s say there aren’t four types of traffic but just two: data and non-data. We also won’t worry about what needs to happen with the non-data; let’s just log it in a separate stream. To have some concrete samples to work with, we need to define the record structure. Let’s say a record starts with a boolean indicating whether the record carries data or not, then an integer specifying the size of its contents, and then the contents itself.

Let’s generate some random samples to work with. Let’s say our records will be anywhere from 0 to 9 bytes long and the contents will always be the sequence from 1 to <size>. If size is 0 the contents will be empty. First, let’s make a random generator that produces an integer between 0 and 9.

random := Random new reading collecting: [ :f | (f * 10) floor ].

The “Random new reading” part yields a stream that uses an instance of Random to generate random floats in the range 0 <= x < 1. The collecting part transforms each float into an integer between 0 and 9. With this we can generate a random sample of 10 records as follows.

sample := Array new writing.
10 timesRepeat: [ | size |
	size := random get.
	sample put: random get even;
		put: size;
		write: (1 to: size) ].
sample close; terminal.

The ‘Array new writing’ bit creates a simple write stream over an Array; as usual, the array will automatically grow as elements are written to it. Closing a collection write stream trims the underlying collection to the actually written size, and the #terminal message returns it (it’s called terminal because it returns whatever is at the bottom of an arbitrarily high stream stack). Here’s one sample generated by the above code.

#(false 4 1 2 3 4 true 4 1 2 3 4 false 3 1 2 3 false 5 1 2 3 4 5 true 2 1 2
 false 8 1 2 3 4 5 6 7 8 true 3 1 2 3 true 3 1 2 3 false 0 false 4 1 2 3 4)

In further text when we refer to ‘sample’ we mean a read stream over such an array, which is created simply by sending #reading to the array.
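For example, using a shortened version of the sample above:

	sample := #(false 4 1 2 3 4 true 4 1 2 3 4) reading.
	sample get.	"=> false, the type flag of the first record"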

Fragments

So we start with a simple stream and we need to parse the records out of it. That is actually quite simple.

fragments := [ | size |
		isData := sample get.
		size := sample get.
		(sample limiting: size)
			closeBlock: [ :stream | stream -= 0 ];
			yourself
	] reading.

Sending #reading to a block creates a block stream, which produces each element by running the block; the element is the result of the block run. Now let’s take a look at what the block produces. It gets a single element from the sample stream and puts it in the variable isData. Assuming the sample stream is aligned with the beginning of a record, the first element should be the boolean indicating the type of the record. Then we get another element from the sample stream, the record size, and use it as the parameter of the #limiting: message. Sending #limiting: to a stream creates a “virtual” substream. The actual contents of the substream come from the underlying stream; the substream just makes sure we don’t read more than the specified limit. The closeBlock is there to make sure that when we close the substream the underlying stream is positioned at the end of it, i.e. at the beginning of the next record. The argument to closeBlock: is the substream itself, and the expression “-= 0” seeks to the end of itself (read: seek 0 bytes from the end of the stream). So the result of the block is a virtual stream over the payload of the current record (called a fragment in the SSL/TLS spec). That means fragments is a stream of record payload streams, and the global variable isData indicates the type of the current record (i.e. the most recent one we read from fragments).
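To illustrate (the exact values depend on the generated sample):

	fragment := fragments get.	"the payload stream of the first record"
	fragment rest.	"=> e.g. #(1 2 3 4), the contents of that record"
	isData.	"=> e.g. false, the type of that record"
	fragment close.	"positions sample at the beginning of the next record"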

Runs

Now we can treat each fragment as a stream, but we know that the fragment boundaries are meaningless. If we have an algorithm that parses a handshake message from a stream, we can’t give it a fragment, because the message might be continuing in a subsequent fragment. What we need is to be able to treat a sequence of adjacent fragments of the same type as a single continuous stream, let’s call it a “run”. As it happens, Xtreams has a handy construct to combine several streams into one, it’s called “stitching”. It takes a stream of streams and makes it look like a single continuous stream. For example, the following two expressions yield the same result.

(1 to: 10) reading
(Array with: (1 to: 3) reading with: (4 to: 7) reading with: (8 to: 10) reading) reading stitching

Conveniently, fragments is a stream of streams; however, we don’t want to stitch it all together. We want to stitch just the adjacent fragments of the same type. So, the result will still be a stream of streams, it’s just a stream of runs rather than a stream of individual fragments. To be able to stitch fragments together we need a stream that will keep handing out fragments while the type is the same and ends when the type changes.

[	fragment := fragments get.
	runType ifNil: [ runType := isData ].
	isData = runType
		ifTrue: [ fragment ]
		ifFalse: [ Incomplete zero raise ]
] reading stitching

We get a fragment; if it’s the first one, we remember its type, and we keep returning fragments until the type changes. When that happens we raise the Incomplete exception, which signals to the block stream that it ended, which in turn signals to the stitching stream that it ended as well. This however won’t work, because it will consume the first fragment of the following run. We need to rewrite it a bit differently so that the non-matching fragment can be carried over to the next run. We’ll rewrite the construct so that the first fragment of a run is obtained outside of the run stream itself and brought in via an external variable, fragment. That way the first fragment of the next run can be fetched by the final iteration of the previous run.

fragment := fragments get.
runType := isData.
[	isData = runType
		ifTrue: [ fragment closing: [ fragment := fragments get ] ]
		ifFalse: [ Incomplete zero raise ]
] reading stitching

The first fragment is read outside of the block stream, so we may as well capture the runType at that point. The block stream then simply compares the current isData value to the captured runType; if it matches, it returns the current fragment. The difficulty here is that we need to get the next fragment only after the previous one was read. The best way to achieve that is to fetch the next one in a close block of the previous one. Again, if the type changes we raise Incomplete to signal the end of the run.

Now that we know how to build a single run stream, we need to wrap that up in a stream of run streams. A simple block stream returning the stitched run streams should suffice.

fragment := fragments get.
runsFinished := false.
runs := [ | runType |
		runsFinished ifTrue: [ Incomplete zero raise ].
		runType := isData.
		[	isData = runType
				ifTrue: [
					fragment closing: [
						[ fragment := fragments get ] ifCurtailed: [ runsFinished := true ] ] ]
				ifFalse: [ Incomplete zero raise ]
		] reading stitching.
	 ] reading.

The tricky part is ending the stream. A block stream will keep running the block (whenever it is asked for an element) until a block run raises an Incomplete. Here we want that to be the moment when we run out of fragments, i.e. when getting the next fragment raises an Incomplete. However, that action is buried inside the close block inside the stitched stream of a run. When it happens, the stitched stream will re-interpret it as the end of itself, so the outer block stream cannot distinguish the end of a run from the end of fragments (it is the end of the last run as well, after all). So somehow we need to capture the fact that getting a fragment raised an Incomplete and bring that information up to the block stream. That’s what the runsFinished variable is for. Without it, the runs stream would keep giving out empty runs forever once it runs out of fragments.

To summarize this step, the “runs” stream turns the “fragments” stream into a stream of runs, where adjacent fragments of the same type are stitched together into a single continuous stream, a run. With our sample input we should get the following result.

runs collect: [ :r | r rest ]
=>
#(#(1 2 3 4) #(1 2 3 4) #(1 2 3 1 2 3 4 5) #(1 2) #(1 2 3 4 5 6 7 8) #(1 2 3 1 2 3) #(1 2 3 4))

Data vs Control

Now that we have a stream of alternating data and non-data runs, we need to stitch the data runs together and log the non-data runs into a separate stream. For that we just need a simple block stream that gets a run; if it’s not data, it logs it and gets the next one.

control := ByteArray new writing.
data := [ | run |
	run := runs get.
	isData ifFalse: [
		control write: run.
		run := runs get ].
	run
	] reading stitching.

With our sample we should get the following results.

data rest
=>
#(1 2 3 4 1 2 1 2 3 1 2 3)

control close; terminal
=>
#[1 2 3 4 1 2 3 1 2 3 4 5 1 2 3 4 5 6 7 8 1 2 3 4]

Note that the whole processing happens behind the scenes as we’re reading from the data stream; the “data rest” bit simply reads everything from the data stream. It happens lazily as data is being read, and nothing is being cached, so the performance characteristics shouldn’t change even if we pump megabytes of data through it. In fact we can easily rewrite the fragment stream so that the sample is generated on the fly.

sample := Array new writing.
fragments := [ | size |
	sample put: (isData := random get even).
	sample put: (size := random get).
	(1 to: size) reading ] reading.

Here the “sample” stream is used just to log what we’ve generated, so that we can verify that the results are correct. We only log the type and size; the contents are implicit. If we use this version of fragments, we can’t call #rest on the data stream, because the fragments never finish; it would just keep reading forever. Here’s a sample run where we read 100 data bytes instead.

data read: 100
=>
#(1 2 1 2 3 4 1 2 3 4 5 1 1 2 3 4 5 6 7 8 9 1 2 3 4 1 2 1 2 1 2 3 4 1 2 3 4 5 6 1 2 3 4 5 6 7 8 9
1 2 3 4 5 6 7 8 9 1 2 3 1 2 1 2 3 4 5 6 7 1 2 3 4 5 6 1 2 3 4 5 6 7 1 1 2 3 4 5 6 7 1 2 3 4 5 1 2 1 2 3)

sample close; terminal
=>
#(false 6 true 2 false 3 true 4 false 7 true 5 false 5 true 1 false 1 true 9 false 7 true 0 false 8 true 4 false 2
false 8 false 4 true 2 true 2 true 0 false 5 true 4 true 6 true 0 true 9 true 9 false 1 false 6 true 3 false 4 true 2
false 5 true 7 false 7 false 8 true 6 false 8 false 5 false 6 false 2 false 5 false 6 false 4 false 0 true 7 true 1
false 1 true 7 true 5 true 2 false 0 false 1 true 5)

control close; terminal
=>
#[1 2 3 4 5 6 1 2 3 1 2 3 4 5 6 7 1 2 3 4 5 1 1 2 3 4 5 6 7 1 2 3 4 5 6 7 8 1 2 1 2 3 4 5 6 7 8 1 2 3...etc...]

We can also easily profile an arbitrarily large run. Let’s rebuild the fragment stream so that it doesn’t log the samples; otherwise the log would keep growing and skew the results unnecessarily. Similarly, let’s turn the control log into a bit bucket too.
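The non-logging fragment stream is simply the previous one with the sample puts removed (the runs and data streams are then rebuilt on top of it as before):

	fragments := [ | size |
		isData := random get even.
		size := random get.
		(1 to: size) reading ] reading.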

control := nil writing.
TimeProfiler profile: [ nil writing write: 10**7 from: data ]

Here’s a time and allocation profile summary from reading 10MB of data (and about as much control data, assuming a reasonably non-biased random generator).

Time

2083 samples, 17.22 average ms/sample, 4023 scavenges, 0 incGCs, 
5 stack spills, 0 mark stack overflows, 0 weak list overflows, 0 JIT cache spills
34.82s active, 1.0s other processes,
35.87s real time, 0.05s profiling overhead
** Totals **
28.8 Context>>findNextMarkedUpTo:
9.6 Context>>terminateTo:
6.5 BlockClosure>>on:do:
3.8 GenericException class>>handles:
3.1 SequenceableCollection>>replaceElementsFrom:to:withSequenceableCollection:startingAt:
2.8 BlockClosure>>cull:
2.4 ResolvedDeferredBinding>>value
2.3 MarkedMethod>>isMarkedForHandle

Space

1394487 samples, 1045 average bytes/sample, 6760 scavenges, 0 incGCs, 
2 stack spills, 0 mark stack overflows, 0 weak list overflows, 0 JIT cache spills
1458037980 bytes
** Totals **
41.7 GenericException class>>new
19.7 Xtreams.ReadStream class>>on:
17.6 LaggedFibonacciRandom>>nextValue
11.1 [] in UndefinedObject>>unboundMethod
6.2 Interval class>>from:to:by:
3.7 Xtreams.StitchReadStream class>>on:first:

To put the results in some perspective, the 20MB of records of average size 5 (0 to 9) means we’ve processed about four million records, each as a stream of its own with a bunch of other virtual streams set up on top. There were lots of lightweight, short-lived objects created in the process: the streams, an Incomplete exception at the end of each stream, the sample intervals representing the contents of each fragment, etc. Apparently the floats generated by the random generator are a significant portion of the profile as well. The profile says that we went through 1.5 GB of objects which, frankly, is a bit more than I’d expect, but the good news is we didn’t trigger a single incremental GC; it was all handled within the scope of new space. Remember also that the total includes sample generation; it seems we can safely attribute at least a quarter of the space cost to that. Either way, the runtime image size didn’t spike at all. With a normal SSL/TLS connection, where record size goes up to 16K, the same amount of overhead should easily cover 50GB of payload (probably more). So the cost of this rather powerful abstraction, which completely hides the underlying protocol, should be quite reasonable and easily dwarfed by all the other overhead on a typical SSL/TLS connection (encryption, IO, etc).

PS: In case you wonder, I did verify the claim that the amount of control payload roughly corresponds to the amount of data payload. For that I used the handy monitoring stream. I wrapped it around the control stream as follows.

control := nil writing monitoring: [ :total | nonData := total ] every: 1 seconds

The stream runs the monitoring block at the specified intervals, providing some optional handy arguments, the first of which is the total number of elements that went through the stream. After the profiling run I simply inspected the value of nonData and it was where it should have been, well within 2% of 10M.

– Posted by Martin Kobetic

Xtreams-SSH2

We are receiving some very encouraging feedback on Xtreams. There’s a fairly complete port to Squeak/Pharo, people are blogging about it, and discussing it on various forums. All that is very welcome and certainly helps reassure Michael and myself that we might be onto something worthwhile, and it keeps us motivated to continue.

At this stage of the game we feel that the core library is reasonably complete, we’re reasonably happy with the API, and we’re venturing into experiments where we’d like to prove that the concepts and implementation are good and that the performance goals are achievable as well. Michael created a neat, very lightweight, yet rather complete IRC client over a weekend. My fascination with security protocols led me to attempt an implementation of SSH2.

I chose SSH because I wanted to learn more about the protocol, and I wanted to compare it with my previous experience implementing SSL (outside of the context of Xtreams). I also see it as a good target for validating our performance goals. Secure protocols are naturally layered, and that seems to be a rather good fit for an attempt to map that structure onto a stream stack: the socket connection at the bottom; various packet splitting/combining, encryption, and hashing layers on top of it; all hopefully coming together into a very simple and transparent binary stream facade. If the abstractions and implementation are right, the stack must behave the same as a simple binary stream and it must not cost much in terms of performance.

So, I’ve been working on this in my spare time for about 2 months now. It was a bit more work than I expected, not because I hit any particularly difficult obstacles (in fact I was making fairly steady progress throughout); I just didn’t really know what I was getting into. SSH is really quite a bit more than just a protocol. It’s a suite of protocols (some documented better than others) with an architectural framework that puts them together for a particular purpose: running a remote shell, executing remote commands, uploading/downloading files, etc. You can’t reasonably compare SSH to SSL as a whole; that would be comparing apples to oranges, or rather comparing an apple to an apple pie. The part of SSH that is roughly comparable to SSL is the bottom-level transport layer combined with the authentication layer running on top of it. That’s all nice and dandy, and I was done with that part in a few weeks, but the problem is, you can’t really use it for anything practical. You could use it for a custom, smalltalk-to-smalltalk, secure communication channel, but you can’t interoperate with anything else out there.

What I wanted was to be able to upload/download a large file with smalltalk on either the client or the server end and to measure how long it takes compared to a native C client, like OpenSSH’s scp command. So, to get there I needed to also implement the connection layer, which provides the multiplexed multi-channel capabilities allowing independent data flows over a single, shared, secure connection. Then I needed to figure out how the scp command uses those facilities to transport files, which involved a rather sparsely and incompletely documented SCP protocol and a good deal of trial-and-error experiments with OpenSSH.

So I’m glad to report that I’m finally seeing the light at the end of the (encrypted) tunnel. At this point I can execute an expression on a Smalltalk server via the ssh command. There isn’t any generic TCP server support built in, so the first half of the example code is just to establish a single TCP connection with a client:

	| listener socket server |
	"This is just to set up a TCP socket connection, nothing to do with SSH2"
	listener := SocketAccessor family: SocketAccessor AF_INET type: SocketAccessor SOCK_STREAM.
	listener soReuseaddr: true.
	listener bindTo: (IPSocketAddress hostAddress: IPSocketAddress thisHost port: 2222).
	[ socket := listener listenFor: 1; accept ] ensure: [ listener close ].
	"Now we have a socket and can set up an SSH2 connection on it, here playing the server side"
	server := SSH2ServerConnection on: socket.
	"This is just to have all SSH messages echoed to transcript"
	server when: SSH2Announcement do: [ :m | Transcript cr; print: m ].
	[ 	"Server normally doesn't do much beyond accepting the client handshake and then waiting for a disconnect.
		Everything is initiated by the client side and handled by background threads handling any established channels."
		server accept; waitForDisconnect
	] ensure: [ server close. socket close ]

The client side interaction looks something like the following:

[mkobetic@latitude ~]$ ssh -p 2222 localhost 3 + 4
7

Not particularly impressive output, so let me also add what this interaction logged into the Transcript (as requested in the example code). It describes the entire message exchange between the client and the server:

-> identification ['Xtreams_Initial_Development']
-> KEXINIT
<- identification ['OpenSSH_5.5']
<- KEXINIT
<- KEXDH_INIT
-> KEXDH_REPLY
-> NEWKEYS
<- NEWKEYS
<- SERVICE_REQUEST ssh-userauth
-> SERVICE_ACCEPT ssh-userauth
<- USERAUTH_REQUEST martin@ssh-connection none
-> USERAUTH_FAILURE #('publickey')
<- USERAUTH_REQUEST martin@ssh-connection publickey ssh-dss 5c:d1:c7:c8:27:48:8c:1a:fe:83:1d:7b:3c:09:49:6d no sig
-> USERAUTH_PK_OK
<- USERAUTH_REQUEST martin@ssh-connection publickey ssh-dss 5c:d1:c7:c8:27:48:8c:1a:fe:83:1d:7b:3c:09:49:6d with sig
-> USERAUTH_SUCCESS
<- CHANNEL_OPEN(0) session 2097152/32768
-> CHANNEL_OPEN_CONFIRMATION(0)(0) 2097152/32668
<- CHANNEL_REQUEST(0) !  env LANG -> en_US.utf8
<- CHANNEL_REQUEST(0) ?  exec 3 + 4
-> CHANNEL_SUCCESS(0)
-> CHANNEL_DATA(0)[2]
-> CHANNEL_EOF(0)
-> CHANNEL_CLOSE(0)
<- CHANNEL_CLOSE(0)
<- DISCONNECT 11 disconnected by user
-> DISCONNECT 11 BY_APPLICATION

And that is not all: it can also upload/download files or directories in either direction (server -> client, client -> server) or execute remote shell commands from a smalltalk client on a remote OpenSSH server. Here’s an example of how to make a smalltalk client talk to an OpenSSH server. The example includes the code needed to read the default user keys from the $HOME/.ssh directory and to make the socket connection:

	| home user keys socket client config service session |
	"The bulk of this is loading up your personal keys from your $HOME/.ssh directory as they are needed to successfully authenticate with the server"
	home := '$(HOME)' asLogicalFileSpecification asFilename.
	user := home tail.
	keys := SSH2Keys new.
	((home / '.ssh' filesMatching: 'id_*') reject: [ :fn | '*.pub' match: fn ]) do: [ :fn || pub pri |
		pri := fn asFilename readStream.
		pri := ([ CertificateFileReader new readFrom: pri ] ensure: [ pri close ]) any asKey.
		pub := (fn, '.pub') asFilename reading encoding: #ascii.
		(pub ending: $ ) -= 0.
		pub := [ Xtreams.SSH2HostKey readFrom: pub encodingBase64 ssh2Marshaling ] ensure: [ pub close ].
		pub := keys publicKeyFrom: pub.
		keys addPublic: pub private: pri ].
	"Now we have the keys and can set up an SSH configuration to use them."
	config := SSH2Configuration new keys: keys.
	"Create a socket"
	socket := SocketAccessor newTCPclientToHost: 'localhost' port: 22.
	"Set up an SSH client connection on it.
	client := SSH2ClientConnection on: socket.
	client configuration: config.
	"This is just so that all SSH messages are echoed into the Transcript"
	client when: SSH2Announcement do: [ :m | Transcript cr; print: m ].
"	client when: SSH2TransportMessage, SSH2ChannelSetupMessage, CHANNEL_CLOSE do: [ :m | Transcript cr; print: m ].
"	[	"A client has to connect as particular user (using the preconfigured keys) and gets a channel service in response"
		service := client connect: user.
		"A channel service can provide an interactive session or a tunnel.
		You can ask for as many sessions, tunnels as you want, each will get its own channel multiplexed over the same SSH connection."
		session := service session.
		"Given a session you can execute a command, or upload/download a file or directory, etc..."
"		[	session exec: 'ls -l'.
		] ensure: [ session close ].
"		[	[ session scpUploadFrom: 'ssh.im' to: '/dev/shm/' ] timeToRun
		] ensure: [ session close ]
	] ensure: [ client close. socket close ]

I also started playing with a “shell” session with a smalltalk server, but rather than invoking or emulating bash, I wanted to run a simple read/eval/print loop in smalltalk instead. Having that, one could use the ssh command to connect to a smalltalk server securely and execute smalltalk expressions on it. It is basically working as is, except the smalltalk side has to do at least a basic level of terminal emulation. A simple CR returned from the server moves the cursor in the terminal down one line but doesn’t move it back to the left. That one would be easy, but it also seems that the default terminal setup expects the server to echo what is typed into the terminal (I couldn’t see what I was typing in my experiments). So I’ll need yet another piece, a basic terminal emulation layer, to make this work reasonably.

Performance is looking good as well. My primary test is uploading/downloading a reasonably large file using scp. Here’s a transcript of a terminal session uploading a file to both an OpenSSH server and a smalltalk server:

[mkobetic@latitude 78]$ ll ssh.im
-rw-rw-r-- 1 mkobetic mkobetic 65M Dec 15 16:14 ssh.im
[mkobetic@latitude 78]$ scp ssh.im mkobetic@localhost:/dev/shm/
ssh.im                                                                               100%   64MB  32.1MB/s   00:02    
[mkobetic@latitude 78]$ scp -P2222 ssh.im mkobetic@localhost:/dev/shm/
ssh.im                                                                               100%   64MB  21.4MB/s   00:03

And here is the same, just transferring the file in the opposite direction, downloading it from the server:

[mkobetic@latitude 78]$ scp mkobetic@localhost:st/78/ssh.im /dev/shm
ssh.im                                                                               100%   64MB  32.1MB/s   00:02    
[mkobetic@latitude 78]$ scp -P2222 mkobetic@localhost:ssh.im /dev/shm
ssh.im                                                                               100%   64MB  32.1MB/s   00:02

The commands with the -P2222 option are the ones running against the smalltalk server (2222 was the port where it listened). The upload is somewhat slower (a different data stream setup is used when sending a file than when receiving one), but the download speed is on par. There are several critical aspects that you need to keep in mind when you want an efficient implementation.

1) You can’t come even close to the bulk encryption and hashing speed with a pure smalltalk implementation (at least not with any of the smalltalks currently available, as far as I know). Just the overhead of indexed variable access in ByteArrays will kill you (last time I looked, accessing an indexed instance variable in VisualWorks was about four times slower than accessing a named instance variable). Moreover, the other side is most likely calling optimized (possibly pure assembler) implementations from libcrypto or some such. So don’t even try. That’s why we didn’t think twice about implementing the cryptographic streams in Xtreams by calling libcrypto (from OpenSSL) to do the heavy lifting. Arguably that’s cheating, but I don’t think it’s particularly different from calling other low-level primitives in the VM. A symmetric cipher (e.g. AES, RC4, …) or a secure hash (SHA, MD5, …) is a specialized bit-twiddling algorithm. Implementing one in smalltalk is educational and fun, but it really isn’t practical in many contexts. There are optimized implementations of all of them available on any OS these days, so I think it’s only reasonable to take advantage of that. Moreover, many application contexts require cryptographic algorithm implementations to be certified (e.g. FIPS 140-2), and other applications may require hardware-accelerated implementations, so leaving it to external facilities is the most pragmatic choice.

2) Even if you do decide to “outsource” bulk encryption and hashing, you need to do it the right way. Calls outside of smalltalk are expensive, so you want to make them worth it. You cannot call out for every byte or two of data; you must send entire buffers to be processed. Xtreams employs 32K buffers by default, which seems to be sufficiently large to offset any costs of calling C (at least in VW). See the sketch after this list.

3) You must avoid expensive garbage. However, note the emphasis on expensive. You don’t need to skimp on every little object. The new space scavenging scheme can chew through megabytes of transient objects in no time. The expensive objects are the ones that make it to the old space but don’t survive long after that. One particular type of object that tends to fall into that category is the large ByteArray used as a buffer. It doesn’t take too many of those allocated in rapid sequence to overflow the new space, causing many of them to tenure into the old space. Since they are large, they will quickly kick the incremental garbage collector into action. Suddenly you’re spending more time garbage collecting than doing the real work. So it’s critical to reuse buffer objects. If you can’t ensure that within your own code, Xtreams comes with a built-in RecyclingCenter, which serves as an overflow staging area for buffers so that they can be picked up and reused when the application is chewing through a lot of them.
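To make points 2) and 3) concrete, here is a minimal sketch of a chunked copy that hands whole 32K buffers around and reuses a single buffer object, using only the read:into:at: and write:from:at: selectors that appear elsewhere in this post:

	| buffer input output |
	buffer := ByteArray new: 32768.	"one reusable 32K buffer"
	input := (ByteArray new: 1000000) reading.
	output := nil writing.
	[ [	input read: buffer size into: buffer at: 1.
		output write: buffer size from: buffer at: 1 ] repeat
	] on: Incomplete do: [ :ex |
		"the final, partial read raises Incomplete carrying the count actually read"
		output write: ex count from: buffer at: 1 ]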

And that’s it; those are what I believe to be the essential ingredients needed to make Xtreams measure up to plain C. And it seems that the results confirm that. So, where to go from here? I still have a few implementation issues listed in the Xtreams-SSH2 package comment. I’d like to add the necessary bit of terminal emulation to make the ssh shell session with a smalltalk server possible. I may add TCP tunneling support, just for completeness; we’ll see. I definitely want to experiment with different approaches for implementing the protocol state machine. I don’t like what I have in Xtreams-SSH2 now (or what’s in the SSL implementation either). I’m still searching for an approach that I’ll like, and once I figure it out I might do Xtreams-TLS as well.

Regarding the future of the Xtreams-SSH2 package, I’m not sure how useful it can be in practice (assuming all is done and polished). Do you think you’d use it for scp upload/download directly to/from smalltalk? Would you use a secure login into a smalltalk server? I don’t think there’s much point in building yet another general-purpose SSH server/client; OpenSSH already does that job rather well. Where I think it might be interesting are smalltalk-specific projects and applications. For example, SSH has this notion of “subsystems” and you can define your own. The only one I know of currently is the sftp subsystem. But the sky is the limit in terms of coming up with new ones. Anyway, if you have ideas for useful applications of a native smalltalk SSH implementation, let me know.

I might write a few more posts on particular implementation details, either from the point of view of how to solve a particular problem using Xtreams, or just as an educational bit about SSH in general. If there’s something about this project that interests you, let me know. I should add that, should you feel particularly bored and want to try this out, the package is available in the Cincom Public Repository. It should work immediately in any sufficiently recent release of VisualWorks. The code should be fairly portable to Squeak/Pharo, but it depends on Xtreams-Xtras, which hasn’t been ported yet. I tried to contain the VisualWorks-specific bits in the SSH2Keys class, which encapsulates the use of RSA/DSA keys and algorithms and currently relies on the VisualWorks Security library. I hope to get around to retargeting it onto the EVP primitives in libcrypto, which would make it the same sort of deal as Xtreams-Xtras (possibly eventually merged into it as well).

– Posted by Martin Kobetic


ResourcefulTestCaseToo

A good while ago I posted ResourcefulTestCase, talking about a simplified pattern for test resources in the context of SUnit. The basic idea was that if you tend to group your test cases around their required resource, you often end up with a package where many TestCase classes are mirrored by TestResource classes one-to-one. In this situation it is more convenient to simply use the class side of the TestCase class as its resource and cut the number of classes in half. Moreover, various resource aspects can be placed into class-side variables and conveniently accessed from instance-side test methods.

Recently I started using SUnitToo and wanted to try the same sort of pattern there. It turned out to be so simple that it’s not even worth adding a dedicated abstract TestCase class to emulate the pattern.

To declare that a test class is its own resource, simply add the following (rather obvious) class method:

resources

	^Array with: self

Two more methods are needed to make the resource work: #isAvailable and #reset. They can be used directly for set-up and tear-down, although note that #isAvailable must return a Boolean. Just return true at the end of the method and you’re set. You can return false to signal that the resource failed to set up, but an exception will have the same effect.

Now, without repeating all the arguments, the original article recommends using shared variables for various aspects of the resource. Here is the rest of an example of a hypothetical test resource:

isAvailable

	TestDirectory := '/dev/shm/testing' asFilename.
	TestDirectory ensureDirectoryExists.
	self generateTestContentIn: TestDirectory.
	^true

And don’t forget to nil out the shared variables in tear-down.

reset

	UnixProcess shOne: 'rm -r ', TestDirectory asString.
	TestDirectory := nil.

That’s it. The only boilerplate code is the rather trivial and obvious ‘self’ in the resources method.

– Posted by Martin Kobetic

Xtreams: Concatenating Streams

Did you ever run into a situation where you had a stream and some previously written chunk of code that could process the stream almost as is, if only the stream included a few additional bytes at the beginning? Usually, I ended up just biting my lip, fetching the full content of the stream, prepending the missing bits, and then setting up an internal stream on top of the collection. That’s assuming it was feasible to load the entire stream into memory. Wouldn’t it be lovely if I could simply prepend a stream in front of another stream and make the two look like one? Let’s give it a try.

One of the things I think we did get right with Xtreams is that it’s xtremely easy to create full-featured subclasses of ReadStream and WriteStream. In the case of ReadStream, the only required methods to implement are contentsSpecies and read:into:at:. That will give you a complete (non-positionable) stream. So let’s make a CompositeReadStream that adds the following inst vars:

	source2  the second source
	active   the currently active source

This stream should not be created with just on:, so let’s declare that #shouldNotImplement and add on:and: instead:

	on: aSource and: aSource2

		active := source := aSource.
		source2 := aSource2

With that in place, it would be a mortal sin not to add ReadStream>>,

	, aReadStream
		"Return a read stream that combines self and @aReadStream into a single stream.
		""
			((1 to: 5) reading, (6 to: 10) reading) rest
		"
		^CompositeReadStream on: self and: aReadStream

The sample in the comment above shows how it’s intended to be used. Obviously we want the composite to produce the combined sequence of the elements from both sources. To get that we just need to implement read:into:at:

	read: anInteger into: aSequenceableCollection at: startIndex

		| count |
		count := 0.
		[	^active read: anInteger into: aSequenceableCollection at: startIndex
		] on: Incomplete do: [ :ex |
			count := ex count.
			active == source ifFalse: [ ex pass ].
			active := source2 ].
		"avoid making the recursive call in the handler"
		^[	self read: anInteger - count into: aSequenceableCollection at: startIndex + count
		] on: Incomplete do: [ :ex |
			(Incomplete on: aSequenceableCollection count: count + ex count at: startIndex) raise ]

The idea is simple: we start reading from source, and when we run out of source we switch to source2.

To satisfy all implementation requirements we also need contentsSpecies; we’ll follow the species of the underlying source stream:

	contentsSpecies
		^source contentsSpecies

And that’s it. We can do something very similar for WriteStreams, although note that it only makes sense to concatenate streams of limited growth, e.g.:

	| stream |
	stream := (String new writing limiting: 1), (String new writing limiting: 6).
	stream := (String new writing limiting: 5), stream.
	stream write: 'Hello World!'; close; terminal

yielding #('Hello' (' ' 'World!')). If the first write stream grows without restriction, then you’ll keep writing into that one and never into the second one.
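For example, if the first stream is unrestricted, everything lands in it and the second stream stays empty (a sketch following the same pattern as above):

	stream := String new writing, (String new writing limiting: 6).
	stream write: 'Hello World!'; close; terminal	"presumably yielding #('Hello World!' '')"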

Now this was too easy; let’s try something more xtreme. If we could make the composite stream add additional sub-streams on demand, we could use it, for example, to cut an arbitrary sentence up into words. One way to achieve that is having the second stream in the composite be something that can turn itself into another composite with itself in the second position again. As soon as the first stream fills up, we trigger the transformation of this stream prototype in the second position into the same kind of composite as the one we started with. This kind of setup can accommodate arbitrarily long input on demand.

Let’s call this prototype stream a ProtoWriteStream. Obviously the prescription for how to turn it into a real stream is a block. For transparency, let’s trigger the transformation with any write-related message send. Here’s the corresponding code for ProtoWriteStream as a direct subclass of WriteStream.

	write: anInteger from: aSequenceableCollection at: startIndex

		self become: destination value.
		^self write: anInteger from: aSequenceableCollection at: startIndex

	contentsSpecies

		self become: destination value.
		^self contentsSpecies

We’re re-using the destination slot to hold the transformation block; that way the on: creation method can be reused as well. To make it easier to create stream prototypes, let’s add BlockClosure>>writingPrototype as well.

	writingPrototype

		^Xtreams.ProtoWriteStream on: self

Additionally we need to override close and flush to be no-ops; a minimal sketch:
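	close
		"A prototype has nothing to close; it only becomes a real stream when written to."

	flush
		"Likewise, nothing to flush yet."

With these in place we can accomplish the stated task as follows.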

	| prototype stream words |
	words := OrderedCollection new.
	prototype :=
		[	((words add: String new) writing ending: Character space)
			, prototype writingPrototype ].
	stream := prototype value.
	stream write: 'the quick brown fox jumps over the lazy dog'; close.
	words

Similarly we can play the same game with read streams. Let’s try to re-compose the words into a single stream.
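The example below also assumes a readingPrototype counterpart to writingPrototype, presumably defined along these lines (with a ProtoReadStream mirroring ProtoWriteStream):

	readingPrototype

		^Xtreams.ProtoReadStream on: self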

	| prototype stream  words |	
	words := ('the quick brown fox jumps over the lazy dog' tokensBasedOn: Character space) reading.
	prototype := [ words get reading, prototype readingPrototype ].
	stream := prototype value.
	stream rest

Obviously we could simply create all the streams and concatenate them at once:

	(('the quick brown fox jumps over the lazy dog' tokensBasedOn: Character space)
		inject: '' reading into: [ :all :word | all, word reading ]
	) rest

However, the prototype-based solution has the advantage of creating the sub-streams lazily, so if you don’t need to consume the whole input, you don’t waste the extra effort on the part that you’ll just throw away.

If you want to play with these concepts, the concatenation support is now part of Xtreams-Xtras. The CompositeReadStream is even positionable, if all its components are positionable as well; but I’m less confident about that part and haven’t implemented it for write streams yet. The proto streams are available in a new package, Xtreams-Xperiments, that we’ve started. You’ll get it automatically if you load the whole XtreamsDevelopment bundle.

– Posted by Martin Kobetic

F-Spot, Glorp and VisualWorks

I’ve been using Linux as my primary desktop platform for some years now. I generally try to keep up with the releases and stick with the default choices as much as possible. Recently I tried to use F-Spot because it’s the default photo manager for GNOME now. It’s got some nice features and is generally OK, although not very flexible. Very much in the spirit of today’s UI design dogmas (“You can’t handle flexibility!”). Anyway, I noticed that F-Spot uses sqlite3 as its database, so I wasn’t too afraid to spend some effort tagging pictures etc.

Recently, as I was upgrading my computers, I decided to move the pictures to a different location. Unfortunately, F-Spot doesn’t seem to provide a way to update its database accordingly. Poking around in the database, it seemed to be a fairly simple database update, so I decided to whip up a quick Glorp-based database mapping and do the update with a script.

The database has a PHOTOS table with the following definition:

CREATE TABLE photos (
	id			INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, 
	time			INTEGER NOT NULL, 
	base_uri		STRING NOT NULL, 
	filename		STRING NOT NULL, 
	description		TEXT NOT NULL, 
	roll_id			INTEGER NOT NULL, 
	default_version_id	INTEGER NOT NULL, 
	rating			INTEGER NULL, 
	md5_sum			TEXT NULL
);

The path to the picture is stored in the base_uri field, usually looking something like ‘file:///home/user/Photos/…’. I needed to change all of them to something like ‘file:///pub/photos…’ instead. So, first I created a Photo class with a simplified version of the above:

	id  database id
	time  time taken
	base_uri  location of the photo
	filename  location of the photo
	description  any notes
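In VisualWorks terms, that amounts to something like the following class definition (a sketch; Glorp will also want plain read/write accessors for each of these variables):

Smalltalk defineClass: #Photo
	superclass: #{Core.Object}
	indexedType: #none
	private: false
	instanceVariableNames: 'id time base_uri filename description '
	classInstanceVariableNames: ''
	imports: ''
	category: 'F-Spot'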

Mappings are defined on subclasses of DescriptorSystem, so I created one and started with description of the class model:

classModelForPhoto: aModel

	aModel newAttributeNamed: #id.
	aModel newAttributeNamed: #time type: Timestamp.
	aModel newAttributeNamed: #base_uri type: String.
	aModel newAttributeNamed: #filename type: String.
	aModel newAttributeNamed: #description type: String.

then the table description.

tableForPHOTOS: aTable

	(aTable createFieldNamed: 'id' type: (self fakeSequenceFor: aTable)) bePrimaryKey.
	(aTable createFieldNamed: 'time' type: platform int4) beIndexed.
	aTable createFieldNamed: 'base_uri' type: platform varchar.
	aTable createFieldNamed: 'filename' type: platform varchar.
	aTable createFieldNamed: 'description' type: platform text.

and finally the mapping between the two:

descriptorForPhoto: aDescriptor

	| table |
	table := self tableNamed: 'PHOTOS'.
	aDescriptor table: table.
	(aDescriptor newMapping: DirectMapping) from: #id to: (table fieldNamed: 'id').
	(aDescriptor newMapping: DirectMapping) from: #base_uri to: (table fieldNamed: 'base_uri').
	(aDescriptor newMapping: DirectMapping) from: #filename to: (table fieldNamed: 'filename').
	(aDescriptor newMapping: DirectMapping) from: #time to: (table fieldNamed: 'time').
	(aDescriptor newMapping: DirectMapping) from: #description to: (table fieldNamed: 'description').


With these in place, I could try to connect to the database. For that I needed to provide the connection information, so I added two class-side methods:

newLogin

	^(Login new)
		database: SQLite3Platform new;
		connectString: (PortableFilename named: '$(HOME)/.config/f-spot/photos.db') asFilename asString.

newSession

	^self sessionForLogin: self newLogin

With this I could invoke #newSession and get a connected session back. Time to start experimenting with the database.

Reading a photo is easy:

	session readOneOf: Photo.

To figure out all of the places from which I’ve imported pictures, I used this:

	query := (Query read: Photo) retrieve: [ :e | e base_uri ].
	(session execute: query) asSet.

It reads all the base_uri values and puts them into a Set. A smarter database query could do this more efficiently, but this was fine in my database of about 6k pictures. I found out I had imported pictures from two locations and decided to deal with them one by one. To perform the update I ran the following:

	photos := session read: Photo where: [ :p | p base_uri like: '%home/mk/Photos%' ].
	session modify: photos in: [
		photos do: [ :p | p base_uri: (p base_uri copyReplaceAll: 'home/mk/Photos' with: 'pub/photos') ] ].

It reads each photo with the selected location in base_uri and updates it with the new one. Then I did the same for the second location. The entire update operation took less than 20 seconds. Later I found out that there’s an F-Spot plugin for this sort of migration, but its comment said it could take a few hours. I don’t know how big a database they had in mind, but that still sounds a bit excessive.

Since then I’ve fleshed out the mappings, created a Glorp Workbook so that it’s more convenient for quick experiments (you get a toolbar button for easy access), and packed it all up. I published the package to the public repository as F-Spot, hoping it might be useful to someone else too. As far as future plans go, there really aren’t any beyond finishing the mapping layer. One thing I’m considering is that I find imports into F-Spot excruciatingly slow; I might use this package for that task instead.

– Posted by Martin Kobetic

WebSupport updates

Our Seaside effort yields some useful byproducts, including improvements to the, so far rather Spartan, WebSupport package. This package now provides HttpClient and HttpRequest extensions simplifying submission of HTML form data through the HTTP POST method.

In general, form data can be submitted in a “url encoded” format in a simple, single-part HTTP request (content-type: application/x-www-form-urlencoded), or each data entry can be submitted as an individual part in a multipart HTTP request (content-type: multipart/form-data). Multipart messages are used when form data contains entries with relatively large values, for example when a form has external files attached to it for upload to the server. More information about HTML forms can be found at http://www.w3.org/TR/html401/interact/forms.html#h-17.13.

The default behavior of the WebSupport extensions is to submit forms as simple requests. Form entries can be added individually using the #addFormKey:value: message, or set all at once using the #formData: message, which takes a collection of Associations. Note that #formData: replaces any previous form content. The following example

	stream := String new writeStream.
	(HttpRequest post: 'http://localhost/xx/ValueOfFoo')
		addFormKey: 'foo' value: 'bar';
		addFormKey: 'file' value: 'myFile';
		writeOn: stream.
	stream contents

yields this result:

POST /xx/ValueOfFoo HTTP/1.1
Host: localhost
Content-type: application/x-www-form-urlencoded
Content-length: 19

foo=bar&file=myFile

An alternative way to post a form is through HttpClient; in this case the request gets executed automatically and the result is the response from the server.

	HttpClient new
		post: 'http://localhost/xx/ValueOfFoo'
		formData: (Array
				with: 'foo' -> 'bar'
				with: 'file' -> 'myFile').

To force the form to submit as a multipart message, send #beMultipart to the request at any point. Any previously added entries will be automatically converted to message parts. Note however that conversion of multipart messages back to simple messages is not supported, as it is not always possible without potentially losing information.

	stream := String new writeStream.
	(HttpRequest post: 'http://localhost/xx/ValueOfFoo')
		beMultipart;
		addFormKey: 'foo' value: 'bar';
		addFormKey: 'file' value: 'myFile';
		writeOn: stream.
	stream contents

and the result is

POST /xx/ValueOfFoo HTTP/1.1
Host: localhost
Content-type: multipart/form-data;boundary="=_vw0.98992842109405d_="
Content-length: 183

--=_vw0.98992842109405d_=
Content-disposition: form-data;name=foo

bar
--=_vw0.98992842109405d_=
Content-disposition: form-data;name=file

myFile
--=_vw0.98992842109405d_=--

File entries can be added using the #addFormKey:filename:source: message. Adding a file entry automatically forces the message to become multipart, so that both the entry key and the filename can be captured.

	stream := String new writeStream.
	(HttpRequest post: 'http://localhost/xx/ValueOfFoo')
		addFormKey: 'foo' value: 'bar';
		addFormKey: 'text' filename: 'text.txt' source: 'some text' readStream;
		writeOn: stream.
	stream contents

and the result is

POST /xx/ValueOfFoo HTTP/1.1
Host: localhost
Content-type: multipart/form-data;boundary="=_vw0.015112462460581d_="
Content-length: 247

--=_vw0.015112462460581d_=
Content-disposition: form-data;name=foo

bar
--=_vw0.015112462460581d_=
Content-type: text/plain;charset=utf_8
Content-disposition: form-data;name=text;filename=text.txt

some text
--=_vw0.015112462460581d_=--

Adding a file entry attempts to guess the appropriate Content-Type for that part from the filename extension. If the guess doesn’t succeed, the content type is set to the default, i.e., application/octet-stream. File names with non-ASCII characters will be automatically encoded using UTF-8. UTF-8 will also be used for the file contents if the source is a character stream (as opposed to a byte stream).
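
For example, a part with a .jpg filename should come out with an image content type. This is just a sketch; the exact type guessed depends on the mime-type mappings in the image, and the photo.jpg name and byte-array source are made up for illustration:

	stream := String new writeStream.
	(HttpRequest post: 'http://localhost/xx/ValueOfFoo')
		addFormKey: 'pic' filename: 'photo.jpg' source: #[255 216 255 224] readStream;
		writeOn: stream.
	stream contents	"the 'pic' part should carry something like Content-type: image/jpeg"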

Adding an entry to a multipart message returns the newly created part. That allows you to modify any of the default settings or to add new ones. Here’s an example changing the filename and file-contents encoding to ISO8859-2:

	stream := String new writeStream.
	request := HttpRequest post: 'http://localhost/xx/ValueOfFoo'.
	part := request addFormKey: 'czech'
				filename: 'kůň.txt'
				source: 'Příliš žluťoučký kůň úpěl ďábelské ódy.' withCRs readStream.
	part headerCharset: #'iso-8859-2';
		charset: #'iso-8859-2'.
	request writeOn: stream.
	stream contents

and the result is

POST /xx/ValueOfFoo HTTP/1.1
Host: localhost
Content-type: multipart/form-data;boundary="=_vw0.74617905623567d_="
Content-length: 228

--=_vw0.74617905623567d_=
Content-type: text/plain;charset=iso-8859-2
Content-disposition: form-data;name=czech;filename="=?iso-8859-2?B?a/nyLnR4dA==?="

Pøíli¹ ¾lu»ouèký kùò úpìl ïábelské ódy.
--=_vw0.74617905623567d_=--

There’s also an API to parse messages containing forms in any of the supported formats. Just send #formData to the HTTP message. The result is a collection of associations, in the same form as the input to the #formData: message.

	(HttpRequest post: 'http://localhost/xx/ValueOfFoo')
		addFormKey: 'foo' value: 'bar';
		addFormKey: 'file' value: 'myFile';
		formData

yields

OrderedCollection ('foo'->'bar' 'file'->'myFile')

File entry values will be entire message parts so that all the associated information can be accessed.

	request := (HttpRequest post: 'http://localhost/xx/ValueOfFoo')
		addFormKey: 'foo' value: 'bar';
		addFormKey: 'text' filename: 'text.txt' source: 'some text' readStream;
		yourself.
	part := request formData last value.
	part contents

yields

some text

If you’d like to give the new code a try, just load it up from the public repository.

– Posted by Martin Kobetic

ResourcefulTestCase

The core SUnit package provides support for shared test resources via the TestResource class. A TestCase that wants to use TestResources is expected to list all of its resource classes in its class-side #resources method. Individual test case methods then access the resources via the resource classes, usually as default, singleton instances. That provides potentially interesting levels of flexibility; however, access to the resources themselves is not exactly convenient. In my experience, the vast majority of cases involving TestResources keep repeating the ‘self resources first default blah’ incantation over and over again, where blah is the name of the real resource the case cares about, managed in a blah instance variable of the corresponding TestResource subclass. A more palatable way is adding the same instance variables to the TestCase subclass as well and copying the resource pieces there in the TestCase>>setUp method. Then you can access the resources directly as instance variables in your test case methods. This way the test methods are clean again, but when you want to employ test resources you still have to go through the following sequence of steps (sketched in code after the list):

  1. create a TestResource subclass
  2. add it to the TestCase class>>resources method
  3. add the same set of instance variables to the TestCase subclass
  4. copy the contents of the TestResource default instance to the TestCase instance variables in TestCase>>setUp method
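
Spelled out, the boilerplate looks roughly like this. It’s a minimal sketch with hypothetical names (MyServerResource, MyTest and the client variable are made up for illustration):

XProgramming.SUnit defineClass: #MyServerResource
	superclass: #{XProgramming.SUnit.TestResource}
	indexedType: #none
	private: false
	instanceVariableNames: 'client '
	classInstanceVariableNames: ''
	imports: ''
	category: 'My Tests'

MyServerResource>>setUp
	"Step 1: the resource initializes its own state once per suite run."
	client := HttpClient new connect: 'testserver'

MyServerResource>>client
	^client

MyTest class>>resources
	"Step 2: list the resource classes the case uses."
	^Array with: MyServerResource

MyTest>>setUp
	"Step 4: copy the resource piece into MyTest's own client inst var (added in step 3)."
	client := self resources first default client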

Naturally, as you realize you’re doing this over and over again, you start thinking: wouldn’t it be nice if you didn’t have to? My first thought was: what if the TestSuite that gets built out of a given TestCase subclass didn’t simply create empty TestCase instances, but instead first created a prototype instance and invoked something like #suiteSetUp on it, which would initialize the shared resources and put them directly into the instance variables of the case prototype? Then all the test cases would be built by simply copying the prototype instance and would therefore have the resource inst vars automagically initialized the same way as the prototype’s. Tear down could be performed by invoking #suiteTearDown on either the prototype or any of the cases; it doesn’t really matter which one, you just have to make sure it is executed exactly once.

Of course, while I was enjoying myself following this thread of thought, I forgot one important detail: TestCase instances are not created just before the suite runs. They are often created much, much earlier. That directly conflicts with the desire to initialize the resources just before they are needed and to shut them down right after the run is complete. I was just about to descend into yet another desperate attempt to rewrite most of SUnit when Alan Knight, who happened to be traveling with me at that moment, responded to my loud complaints with a simple: “Just put them into class variables.”

And so the ResourcefulTestCase was born. It is an abstract TestCase subclass with simplified support for shared test resources. Instead of separate test resource classes, it makes sure that if there are class-side #setUp and #tearDown methods defined on the class, they run before and after the test suite that gets built out of that test class. This allows one to initialize and store shared resources in class-side variables in the #setUp method. It’s probably easiest to use shared class variables, for easy access from the instance-side test case methods. I also like that the capitalized first letter nicely highlights, in the test code, the difference between the shared resources and the private stuff in case instance variables. Obviously, the resources need to be torn down in the class-side #tearDown method. It is usually also desirable to nil out the variables so that the class doesn’t hang on to garbage. The nilling out could be done automatically with a bit of meta-programming, but since it’s often also important to finalize/close/release the resources properly, I figured it’s better to force the user to deal with that rather than facilitate potentially serious leaks with more automated magic. It’s probably also a good idea to call the super implementations of the setUp/tearDown methods so that case hierarchies work well.

Here’s a quick summary of how to create a test case with resources:

  1. Make the test class a subclass of ResourcefulTestCase:
    XProgramming.SUnit defineClass: #MyTest
    	superclass: #{XProgramming.SUnit.ResourcefulTestCase}
    	indexedType: #none
    	private: false
    	instanceVariableNames: ''
    	classInstanceVariableNames: ''
    	imports: ''
    	category: 'My Tests'
  2. Add class variables for the shared resources (here A and Client).
  3. Add a class-side setUp method and initialize the test resources there
    MyTest class>>setUp
    
    	A := Object new.
    	Client := HttpClient new connect: 'testserver'
  4. Add a class-side tearDown method releasing the resources
    MyTest class>>tearDown
    
    	A := nil.
    	Client close.
    	Client := nil.

And that’s it. All nicely and intuitively packaged in a single class. Now you can simply use the resources in your test methods:

MyTest>>testResources

	self assert: A class == Object.
	self assert: (Client get: 'index.html') isSuccess
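
To see the class-side hooks fire exactly once around the whole run, build the suite from the class and run it (standard SUnit usage):

	MyTest suite run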

The supporting code is published in the public repository in a package called SUnit-SimpleResources.

– Posted by Martin Kobetic