Daniel Fortunov's Adventures in Software Development »
Written on 08-Jun-2010 by asqui
Concurrent Programming on Windows (by Joe Duffy) is a book so good I had to put it down, frequently, to stop and think. The information density is pretty high and I often found myself staring blankly into space for minutes at a time, book open on my lap, thinking through what I’d just read.
This is probably the definitive book on concurrency in Windows, covering general principles and the relevant APIs across both native (Win32) and managed (.NET). It has a good balance of theoretical discussion and practical advice, with no shortage of references at the end of each chapter for those who feel inclined for some additional background reading. (For instance, the “Further Reading” section at the end of Chapter 10 “Memory Models and Lock Freedom” points to some light reading: AMD x86-64 Architecture Programmer’s Manual, Volumes 1–5 (!))
What makes this book truly valuable is the amount of information and knowledge that it aggregates, from obscure technical sources, academic papers, and even first-hand spelunking in the Windows source code to find answers to some undocumented behavioural details. It also provides plenty of practical advice garnered from years of experience.
For me, having mainly managed programming experience, this book provided a nice opportunity to understand more about the underlying obscurities of Win32, and how these relate to and contrast with what is exposed in .NET. Having that underlying knowledge has let me see how passing the Invalid Wait Handle value to some asynchronous methods can make them execute synchronously instead, and to understand that asynchronous I/O needs to be decided on when a file handle is opened — details that had previously eluded me in my journeys through managed-land.
Other things that were interesting to learn about included lock-free algorithms (with clever tricks like structuring a lock-free linked list such that it has a sentinel node when empty, cunningly avoiding the problem of updating two pointers when the list transitions between empty and non-empty), and the details of kernel-mode synchronisation primitives, with their limitless caveats (the abandoned mutex scenario was my favourite… when waiting on a named mutex it is possible that it would have been abandoned if another process exited before releasing it. Despite returning an error, the operation has succeeded in acquiring the mutex and you must still remember to release it! As if you didn’t have enough to think about by that point, with all the other complexities around alertable waits and pumping the message queue if you’re in an STA).
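The abandoned mutex caveat can be sketched in managed code, where the wait surfaces as an AbandonedMutexException rather than a return code. This is a minimal illustration, not something from the book: it assumes an unnamed mutex abandoned by a thread in the same process (instead of a named mutex and a second process), and the class and method names are made up for the example.

```csharp
using System;
using System.Threading;

public static class AbandonedMutexDemo
{
    // Returns true if the wait reported abandonment. Either way, the calling
    // thread owns the mutex afterwards and releases it.
    public static bool AcquireAfterAbandonment()
    {
        using (var mutex = new Mutex())
        {
            // A worker acquires the mutex and exits without releasing it.
            var worker = new Thread(() => mutex.WaitOne());
            worker.Start();
            worker.Join();

            bool abandoned = false;
            try
            {
                mutex.WaitOne();
            }
            catch (AbandonedMutexException)
            {
                // The wait "failed", yet this thread now owns the mutex.
                abandoned = true;
            }

            // Works only because we are the owner; a non-owner calling
            // ReleaseMutex would get an ApplicationException.
            mutex.ReleaseMutex();
            return abandoned;
        }
    }
}
```

Calling AbandonedMutexDemo.AcquireAfterAbandonment() should report the abandonment and still release cleanly, which is the "you must still remember to release it" point in practice.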
I previously said that, at a length of 736 pages, CLR via C# (2nd Edition) was the largest book I have ever read. But with a length of 930 pages, Concurrent Programming on Windows has surpassed this. Next up on the reading list: CLR via C# (3rd Edition).
Written on 10-May-2010 by asqui
Another useful snippet of knowledge gained from reading Concurrent Programming on Windows (by Joe Duffy):
Did you know that asynchronous file I/O in .NET is not just about calling FileStream.BeginRead() or BeginWrite() in place of Read() or Write()? You should also make sure that the FileStream is opened for asynchronous operations, otherwise you’ll quietly get less performant ‘mock’ async operations that just execute synchronous I/O on the thread pool, rather than using true overlapped I/O at the Win32 level.
The natural starting point for creating a FileStream is the static File.Open() method, the documentation for which mentions nothing about synchronicity of the FileStream that is created! Nor does it allow you to provide FileOptions (which are used to specify the magic FileOptions.Asynchronous flag).
Instead, the FileStream is created with FileOptions.None. Any asynchronous operations are quietly faked by the obliging implementation of the Stream base class, which merely wraps the corresponding synchronous method in a delegate and invokes it on the thread pool using the BeginInvoke() method.
This is a deviation from the usual ‘pit of success’ design philosophy, where everything in .NET seems to work as you think it would, without a need to closely read the documentation and/or gradually discover obscure catches and gotchas over time.
Admittedly I’ve never actually used asynchronous file I/O (for the applications I’ve worked on have used databases, queues, and other remote data persistence rather than local files) or else I might have read the FileStream.BeginRead() and BeginWrite() documentation a little more closely:
FileStream provides two different modes of operation: synchronous I/O and asynchronous I/O. While either can be used, the underlying operating system resources might allow access in only one of these modes. By default, FileStream opens the operating system handle synchronously. In Windows, this slows down asynchronous methods. If asynchronous methods are used, use the FileStream(String, FileMode, FileAccess, FileShare, Int32, Boolean) constructor.
That last Boolean parameter to the FileStream constructor is called useAsync and, if true, results in FileOptions.Asynchronous being used (or you can also use the other constructor overload which takes FileOptions in the last parameter, and specify FileOptions.Asynchronous yourself).
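As a rough sketch of the constructor route described above, the following opens a file with FileOptions.Asynchronous and reads from it with the BeginRead()/EndRead() pair. The class name, method names and the 4096-byte buffer size are choices made for this example only.

```csharp
using System;
using System.IO;

public static class AsyncFileStreamDemo
{
    // Opens the file for asynchronous I/O by passing FileOptions.Asynchronous,
    // which File.Open() gives you no way to specify.
    public static FileStream OpenForAsyncRead(string path)
    {
        return new FileStream(path, FileMode.Open, FileAccess.Read,
                              FileShare.Read, 4096, FileOptions.Asynchronous);
    }

    // Reads the start of the file using BeginRead/EndRead and returns the
    // number of bytes read.
    public static int ReadFirstBytes(string path, byte[] buffer)
    {
        using (FileStream stream = OpenForAsyncRead(path))
        {
            IAsyncResult pending = stream.BeginRead(buffer, 0, buffer.Length, null, null);
            // Other work could happen here while the read is in flight.
            return stream.EndRead(pending);
        }
    }
}
```

The FileStream.IsAsync property is a quick way to check which mode you ended up with: it should be true for a stream returned by OpenForAsyncRead() and false for one returned by File.Open().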
The underlying Stream.BeginRead() and BeginWrite() methods also talk about synchronicity:
The default implementation of BeginRead on a stream calls the Read method synchronously, which means that Read might block on some streams. However, instances of classes such as FileStream and NetworkStream fully support asynchronous operations if the instances have been opened asynchronously. Therefore, calls to BeginRead will not block on those streams. You can override BeginRead (by using async delegates, for example) to provide asynchronous behavior.
I think this documentation is out of date, or at least a little unclear. The default implementation of BeginRead does not call Read synchronously — Reflector shows that it calls Read by wrapping it in a delegate and calling BeginInvoke, which would result in it being called on a thread pool thread. This is an asynchronous call (with respect to the caller of BeginRead).
Perhaps the documentation is out of date, since it also suggests "using async delegates" to implement your own asynchronous behaviour — what advantage would that give you over the default implementation which does just the same?
As ever, the truth lies in Reflector.
In summary, if you want to do asynchronous file I/O:
Finally, if you’re using asynchronous I/O you must care about performance, so don’t forget to measure, measure, measure! And heed the warning hidden in the documentation of that useAsync parameter:
useAsync Specifies whether to use asynchronous I/O or synchronous I/O. However, note that the underlying operating system might not support asynchronous I/O, so when specifying true, the handle might be opened synchronously depending on the platform. When opened asynchronously, the BeginRead and BeginWrite methods perform better on large reads or writes, but they might be much slower for small reads or writes. If the application is designed to take advantage of asynchronous I/O, set the useAsync parameter to true. Using asynchronous I/O correctly can speed up applications by as much as a factor of 10, but using it without redesigning the application for asynchronous I/O can decrease performance by as much as a factor of 10.
Written on 09-Feb-2010 by asqui
ILMerge is a utility from Microsoft Research that combines multiple .NET assemblies into a single assembly. This is convenient when you want to combine your application and its dependencies into a single DLL file, for example, to make deployment and versioning easier.
ILMerge is released as a console application but also exposes an API to allow you to use it in other applications. For example, I see there are some GUI applications to ease the burden of typing in all those command line switches. ILMerge is mysteriously missing from the community collections of MSBuild tasks, such as the SDC Tasks Library and MSBuild Extended Tasks, probably because it is perfectly feasible to invoke the ILMerge executable using the Exec task that is provided with MSBuild.
The goal is to integrate ILMerge into MSBuild, such that it runs automagically every time the project is built (either within Visual Studio, or with MSBuild from the command line).
Unfortunately there are some interesting details to integrate smoothly into the build, such as making sure the task handles incremental builds properly (so that adding ILMerge to one project in a solution doesn’t force a re-build of that entire sub-tree every time you build!)
I’ve not been able to find an adequate pre-canned way to achieve this, but I’ve hacked something together starting from Jomo Fisher’s solution and addressing some of the shortcomings I found along the way.
Hand-edit your MSBuild project (e.g. *.csproj) file to tag the referenced assemblies you’d like to merge with the ILMerge=True metadata, like this:
<Reference Include="DependencyLibrary, Version=1.0.0.0, Culture=neutral, processorArchitecture=MSIL">
  <SpecificVersion>False</SpecificVersion>
  <HintPath>Referenced Assemblies\DependencyLibrary.dll</HintPath>
  <ILMerge>True</ILMerge>
  <Private>False</Private>
</Reference>
(Note that it is not necessary to set CopyLocal=True for the target assemblies.)
Then, define the following targets and properties at the bottom of your MSBuild project (just above the </Project> tag):
<Target Name="AfterBuild" DependsOnTargets="ILMerge" />
<PropertyGroup>
  <ILMergeExecutable>"..\BuildTools\ILMerge\ILMerge.exe"</ILMergeExecutable>
  <KeyFile>"$(ProjectDir)MyApplication.snk"</KeyFile>
</PropertyGroup>
<Target Name="ILMerge" Inputs="@(IntermediateAssembly)"
        Outputs="@(MainAssembly -> '%(RelativeDir)%(Filename).ILMergeTrigger%(Extension)')">
  <CreateItem Include="@(ReferencePath)" Condition="'%(ReferencePath.ILMerge)'=='True'">
    <Output TaskParameter="Include" ItemName="ILMergeAssemblies" />
  </CreateItem>
  <Exec Command="$(ILMergeExecutable) /Closed /Internalize /Lib:$(OutputPath) /keyfile:$(KeyFile) /out:@(MainAssembly) &quot;@(IntermediateAssembly)&quot; @(ILMergeAssemblies->'&quot;%(FullPath)&quot;', ' ')" />
  <!-- Make a copy of the merged output DLL to use as a trigger for incremental builds -->
  <Copy
    SourceFiles="@(MainAssembly)"
    DestinationFiles="@(MainAssembly -> '%(RelativeDir)%(Filename).ILMergeTrigger%(Extension)')" />
</Target>
Here's the full working solution:
ILMergeExperiments
There are a couple of hacks here to deal with the fact that we want our ILMerged assembly to have the same name as the original:
This is somewhat hacky, and I’m sure there must be a more cunning way to integrate into MSBuild; I’ll have to revisit this once I’ve read the book Inside the Microsoft Build Engine: Using MSBuild and Team Foundation Build.
Written on 31-Jan-2010 by asqui
One of the obscure gems garnered from my current reading of the book Concurrent Programming on Windows (by Joe Duffy) is an insight into the INVALID_HANDLE_VALUE constant.
In Win32 programming, functions that return a HANDLE (such as CreateFile) may return INVALID_HANDLE_VALUE to indicate failure (sometimes). You can check for this return value and call GetLastError to find out why the operation failed.
In .NET, functions typically indicate unexpected failure by throwing an exception. The endless dance of “Do something; Did it succeed? If not, why did it fail? Do something else; Did it succeed? …” is replaced by structured exception handling and constructs such as try-catch, which let you defer thinking about error scenarios until you want to, rather than thinking… about errors… at every… step… of… the… way.
So if .NET methods such as File.Open() will throw exceptions rather than returning INVALID_HANDLE_VALUE we have no need to expose INVALID_HANDLE_VALUE in the .NET BCL, right? Not quite.
In addition to being used as a magic return value indicating failure, INVALID_HANDLE_VALUE also has some magic powers with methods that accept a HANDLE as a parameter. Now, you won’t get any useful behaviour from passing INVALID_HANDLE_VALUE to CloseHandle, however there is a group of functions that let you provide an event handle, do some asynchronous work, and then signal your event to let you know the work has been completed.
Functions such as UnregisterWaitEx and DeleteTimerQueueTimer will cancel any pending registered wait operation or a timer-queue timer, however if a callback has already been triggered this will still run to completion. If you need to clean up any resources used by your callback, to avoid pulling the rug out from under its feet, you must first ensure that your callback is not still executing. To avoid having to manually introduce control synchronisation in your callback, UnregisterWaitEx and DeleteTimerQueueTimer let you provide an event handle which will be signalled when any executing callbacks have returned.
If you don’t want the overhead of allocating another event and then registering a wait on it (in order to perform the clean-up asynchronously, when the event is signalled) you can tell the function to block and wait for any executing callback functions to complete before returning by providing INVALID_HANDLE_VALUE for the wait handle.
Now the interesting part: Since we previously concluded that there is no need to expose INVALID_HANDLE_VALUE in the .NET BCL, how would we get this handy blocking behaviour from the .NET equivalents to the methods mentioned above: RegisteredWaitHandle.Unregister(WaitHandle) and System.Threading.Timer.Dispose(WaitHandle)?
The MSDN documentation makes no suggestion that this behaviour is even possible (not even in the preview documentation for .NET 4). I’m not sure if this is an oversight or an intentionally unsupported behaviour.
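For contrast, here is a sketch of the documented route: pass a real event to System.Threading.Timer.Dispose(WaitHandle) and then wait on it yourself. The class and method names are hypothetical, chosen for this example.

```csharp
using System;
using System.Threading;

public static class TimerShutdownDemo
{
    // Stops the timer and blocks until no callback is still running.
    public static void DisposeAndWait(Timer timer)
    {
        using (var callbacksDone = new ManualResetEvent(false))
        {
            // Dispose(WaitHandle) is documented to return false when the call
            // does not succeed; waiting only on true avoids blocking forever
            // on an event that will never be signalled.
            if (timer.Dispose(callbacksDone))
            {
                callbacksDone.WaitOne();
            }
        }
    }
}
```

After DisposeAndWait() returns it should be safe to tear down whatever the callback was using. The cost is the extra event allocated on every call, which is exactly the overhead the INVALID_HANDLE_VALUE trick avoids, and the managed API gives no documented way to ask for that blocking behaviour.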
To work around this we can do a little poking around the BCL with Reflector:
So the only way to get at INVALID_HANDLE_VALUE in .NET is to subclass WaitHandle. We don't actually need to do anything in our subclass, mind you:
public class InvalidWaitHandle : WaitHandle { }
So there you have it, pretty convoluted but works like a charm!
Here’s the full version, with documentation and a cached instance:
using System.Threading;

/// <summary>
/// An inert wait handle that can be used to avoid allocating a real event in
/// some situations.
/// </summary>
/// <remarks>
/// <para>
/// An <see cref="InvalidWaitHandle"/> can be provided to methods such as
/// <see cref="RegisteredWaitHandle.Unregister(WaitHandle)"/> and
/// <see cref="Timer.Dispose(WaitHandle)"/>. In this case, the function waits
/// for all callback functions to complete before returning, rather than
/// returning immediately and signalling the provided wait handle
/// asynchronously.
/// </para>
/// <para>
/// Internally, this results in the use of INVALID_HANDLE_VALUE when calling
/// the underlying Win32 functions.
/// </para>
/// <para>
/// For further information, see "Concurrent Programming on Windows" (First
/// Edition, 2009) by Joe Duffy, p. 374, 377.
/// </para>
/// </remarks>
public class InvalidWaitHandle : WaitHandle
{
    static InvalidWaitHandle()
    {
        Instance = new InvalidWaitHandle();
    }

    /// <summary>
    /// Gets a shared instance of <see cref="InvalidWaitHandle"/> which may
    /// be re-used.
    /// </summary>
    /// <remarks>
    /// Using this field allows a single <see cref="InvalidWaitHandle"/> to
    /// be re-used as opposed to creating a <c>new</c> instance at every call
    /// site.
    /// </remarks>
    /// <value>A shared instance of <see cref="InvalidWaitHandle"/> which may
    /// be re-used.</value>
    public static InvalidWaitHandle Instance { get; private set; }
}
Written on 28-Jul-2009 by asqui
If it looks like a duck, and quacks like a duck, then it must be a duck!
“In computer programming, duck typing is a style of dynamic typing in which an object's current set of methods and properties determines the valid semantics, rather than its inheritance from a particular class or implementation of a specific interface.” — Wikipedia
With duck-typing an interface implementation is implicit once you have implemented the relevant members. .NET does not currently have any broad support for this, however, with the emergent dynamic language features, I wouldn't be surprised if this were supported natively by the runtime in the near future.
In the mean time, you can synthesise duck-typing via reflection, with a library such as this, which would allow you to do a duck-typed cast like this:
IDoo myDoo = DuckTyping.Cast<IDoo>(myFoo);
Interestingly, there is one small place where duck-typing is in use in C# today — the foreach operator. Krzysztof Cwalina states that in order to be enumerable by the foreach operator, a class must:
Provide a public method GetEnumerator that takes no parameters and returns a type that has two members: a) a method MoveNext that takes no parameters and return a Boolean, and b) a property Current with a getter that returns an Object.
Notice that he makes no mention of IEnumerable nor IEnumerator. Although it is common to implement these interfaces when creating an enumerable class, if you were to drop the interfaces but leave the implementation, your class would still be enumerable by foreach. Voila! Duck-typing!
But don’t take my word for it. Here’s some demo code to prove it:
using System;

public class Program
{
    public static void Main()
    {
        // Compiles and runs even though DuckEnumerable implements no interface
        foreach (int i in new DuckEnumerable())
            Console.WriteLine(i);
        Console.ReadKey();
    }
}

public class DuckEnumerable // Not IEnumerable
{
    public Duck GetEnumerator()
    {
        return new Duck();
    }
}

public class Duck // Not IEnumerator
{
    private int n = 0;

    public int Current
    {
        get { return this.n; }
    }

    public bool MoveNext()
    {
        return (this.n++ < 10);
    }
}
Written on 10-Jul-2009 by asqui
It has not been a good week for me with the System.Xml namespace: I’ve found two bugs in two days! Yesterday’s discovery of poor argument validation in XmlDocument.Load(Stream) was a pretty minor point really, a case of nice framework design style; as a contrast, today I uncovered a bona fide bug with XML Serialization in the XmlSerializer class. This is more serious: a case of compliance with the XML specification!
When generating an XML serializer for schemas that feature a free-form xs:any node, the deserialization behaviour is incorrect in some scenarios.
For example, if you want the capability to hold an arbitrary fragment of XML configuration that is specified in a client-specific schema (which is not known up front, and cannot be included in your schema) you might address this by including a “freestyle” configuration element like this.
<xs:element name="config" minOccurs="0">
  <xs:complexType mixed="true">
    <xs:sequence minOccurs="0">
      <xs:any processContents="skip" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
If you then use the XmlSerializer to deserialize this configuration, (or the sgen utility, which uses XmlSerializer under the covers) you may run in to the problem detailed below.
Consider the following pair of documents:
Document A
<Root>
  <FirstChild>
    <config></config>
  </FirstChild>
  <SecondChild/>
</Root>
Document B
<Root>
  <FirstChild>
    <config/>
  </FirstChild>
  <SecondChild/>
</Root>
These two documents should be equivalent. The only difference is between <config></config> and <config/>. Just to make sure I’m not going insane, the XML Specification says that “The representation of an empty element is either a start-tag immediately followed by an end-tag, or an empty-element tag.” So these two documents are slightly different representations of something which should be semantically identical. The output from XML deserialization should be the same for both.
However, this is not the case (at least not in .NET 2.0 SP 2, version 2.0.50727.3053).
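By way of comparison, a plain DOM parse does see the same content in both forms. The sketch below is not part of the repro solution; the class and method names are invented for this example, and it simply loads a document with XmlDocument and inspects the config element.

```csharp
using System;
using System.Xml;

public static class EmptyElementDemo
{
    // Parses a document shaped like Document A or B and returns its <config> element.
    public static XmlElement LoadConfig(string xml)
    {
        var document = new XmlDocument();
        document.LoadXml(xml);
        return (XmlElement)document.SelectSingleNode("/Root/FirstChild/config");
    }

    // True when the element carries no child nodes, whichever way it was written.
    public static bool HasNoContent(string xml)
    {
        return LoadConfig(xml).ChildNodes.Count == 0;
    }
}
```

HasNoContent() should return true for both Document A and Document B. The DOM does remember which tag form was used (XmlElement.IsEmpty), but that is a serialization detail rather than a difference in content. With XmlSerializer the outcome is different: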
Document A is deserialized successfully. The FirstChild and SecondChild elements are non-null in the resulting class instance.
Document B is also deserialized successfully, however the XmlSerializer instance raises its UnknownNode event for the SecondChild node. In the resulting deserialized class instance, the FirstChild element is accessible, but the SecondChild element is null!
The reason for this appears to be a bug in the generated serializer class which causes the method responsible for deserializing FirstChild (method Read2_RootFirstChild in the repro solution, linked below) to overrun past the end of the FirstChild element, and consume the SecondChild element from the reader, reporting it as an unrecognised element. (See 2l5qpkfr.0.cs line 251 in the repro solution.)
Subsequently, the method responsible for deserializing SecondChild (method Read1_Object) is unable to deserialize this element because it has already been consumed!
Here is a Visual Studio solution with a standalone repro of the problem, including an annotated copy of the generated serialization classes, showing where I believe the bug to be:
I was unable to find this bug on Microsoft Connect so I am not sure if this issue is known to Microsoft.
Update (12 July 2009): Reported this on Microsoft Connect.
Written on 09-Jul-2009 by asqui
Whilst browsing the .NET 3.5 framework source code yesterday, I came across an odd peculiarity with the XmlDocument class. The XmlDocument.Load(stream) method does not appear to be validating the input stream against a null value. I couldn’t believe it was a real bug until I had reproduced it with the following one-liner:
(new XmlDocument()).Load((Stream)null);
Which results in the following exception:
System.NullReferenceException: Object reference not set to an instance of an object.
   at System.Xml.XmlReader.CalcBufferSize(Stream input)
   at System.Xml.XmlTextReaderImpl.InitStreamInput(Uri baseUri, String baseUriStr, Stream stream, Byte[] bytes, Int32 byteCount, Encoding encoding)
   at System.Xml.XmlTextReaderImpl.InitStreamInput(Stream stream, Encoding encoding)
   at System.Xml.XmlTextReaderImpl..ctor(String url, Stream input, XmlNameTable nt)
   at System.Xml.XmlTextReader..ctor(Stream input, XmlNameTable nt)
   at System.Xml.XmlDocument.Load(Stream inStream)
   at Scratch.XmlDocumentNullCheckTest.CallLoadWithNullStream()
You can see why I was unsure of myself: the rogue null stream drills in through five levels of functions before reaching the innocent-enough CalcBufferSize() method, which asks the null stream whether it CanSeek.
What XmlDocument.Load(Stream) should really do is to trap my null input right at the front door, and throw the specialised exception which exists for this exact scenario, System.ArgumentNullException, instead of exposing its implementation internals to my rogue input value.
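The front-door check being asked for looks something like the following. This is only an illustration of the guard, written as a hypothetical wrapper (XmlLoading.LoadDocument is not a framework API), not the framework's own code.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlLoading
{
    // Hypothetical wrapper: rejects a null stream before any framework code
    // runs, naming the offending parameter in the exception.
    public static XmlDocument LoadDocument(Stream input)
    {
        if (input == null)
        {
            throw new ArgumentNullException("input");
        }

        var document = new XmlDocument();
        document.Load(input);
        return document;
    }
}
```

With the guard in place, XmlLoading.LoadDocument(null) fails with an ArgumentNullException whose ParamName is "input", so the caller learns which argument was at fault without reading a stack trace full of implementation internals.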
The first thing XmlDocument.Load(Stream) does is to delegate the task to the XmlTextReader class. Since XmlTextReader is a public class, and its (Stream, XmlNameTable) constructor is also public, then XmlTextReader is also in violation. (Perhaps the author of XmlDocument.Load() felt it acceptable to not validate for a null stream because they were relying on the XmlTextReader constructor doing that check for them?)
[ These problems exist in System.Xml 2.0.0.0 (2.0.50727.4918) but have probably been fixed in the version that ships as part of .NET 4.0. ]
The other reason I refused to believe this could be true until I had reproduced it is because I know that there is a static code analysis rule built in for this exact scenario: CA1062: Validate arguments of public methods. This rule should trigger for both XmlDocument.Load(Stream) and XmlTextReader..ctor(Stream, XmlNameTable) to remind the developer to validate any reference arguments passed in to externally visible methods.
I tried to validate the behaviour of this rule when I was reminded that this static analysis rule (and a few others) were actually removed in Visual Studio 2008! One of the buggy and ill-performing static analysis engines was cut loose for VS2008 (along with the rules that depended upon it) as part of a longer-term strategic move to write a new data flow analysis engine based on MSR’s “Phoenix”.
The rules that were removed for 2008 were reinstated in the Visual Studio 2010 September CTP. When they talk about “8 New Data Flow rules” in the September CTP, I guess they really mean “8 old rules from VS2005, that were removed in VS 2008, are now back” — I confirmed this by correlating the new analysis rules listed in the VS2010 September CTP Walkthroughs document with the list of rules removed in VS 2008 due to removal of the data flow engine.
I was hoping that some new rules would also pop up in the Beta, but there’s been no mention of this on the Code Analysis Team Blog so maybe they only managed to reinstate the old rules that were removed for now. This is understandable, considering they had to re-write the data flow analysis engine.
Update (12 July 2009): Reported on Microsoft Connect.
Update (16 July 2009): This can’t be fixed because it might break backward compatibility for applications that are already trapping the NullReferenceException. Ah, one of the joys of strict framework versioning: Unfixable bugs!
Written on 21-May-2009 by asqui
Invoking a batch file from an MSBuild script (such as any *.csproj file) is a snap with the standard Exec build task. However, I recently discovered a little caveat with this, and it took a little digging to get to the bottom of it.
Consider the following MSBuild project file, which invokes a batch script:
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Target Name="Default">
    <Exec Command="batch.cmd" />
  </Target>
</Project>
If your batch script does anything even moderately complicated, it is reasonable that you would want failure of the batch script to cause failure of the MSBuild project. You can do this using the handy EXIT /B command to return from your batch script with an error code:
@ECHO OFF
ECHO Doing stuff
ECHO Failed...
EXIT /B 42
The only problem is that the MSBuild script above will merrily announce that the build succeeded, prompting you to scratch your head a little and wonder.
You might, as I did, try it without the /B switch, and you'd see that it works — MSBuild traps the non-zero return and fails the build. And if you're the sort of person that doesn't ask many questions this might suffice. Only you'd be left with a rather rude batch file.
Let's review the documentation for the EXIT command:
C:\>exit /?
Quits the CMD.EXE program (command interpreter) or the current batch script.
EXIT [/B] [exitCode]
/B specifies to exit the current batch script instead of
CMD.EXE. If executed from outside a batch script, it
will quit CMD.EXE
exitCode specifies a numeric number. if /B is specified, sets
ERRORLEVEL that number. If quitting CMD.EXE, sets the process
exit code with that number.
So the /B stands for "Behave". If you call EXIT 1 in the middle of your batch you immediately and unconditionally cause the command shell to exit with that error code.
So we want to be good, and have our script Behave, so we put back that /B switch, and scratch our head some more about why MSBuild is missing this.
Then we get bored of scratching and bust out .NET Reflector to go to town on the Exec task.
So it turns out that internally, Exec doesn't just call your command directly. Instead it wraps it in a batch script of its own! It generates a batch in your temp directory that looks something like this:
setlocal
set errorlevel=dummy
set errorlevel=
batch.cmd
exit %errorlevel%
Then it invokes that with CMD /C and quickly deletes the temporary file. I'm not quite sure why it goes through the trouble of this intermediate batch file, to be honest; what extra value does it add? Wouldn't CMD /C batch.cmd give the same outcome?
One thing that it does add is a little silent caveat. If you invoke a batch file from within another batch file, the first batch file will never resume execution unless you use the CALL command to invoke the second batch.
So what is happening here is that, because we had no idea of this undocumented batch file being created in the background by the Exec task, our command of "batch.cmd" is being plonked in the middle of this generated batch file, and then rudely prevents the generated batch from explicitly calling EXIT to return the error code to MSBuild!
The solution? If you're going to call a batch file with the Exec task, prefix it with a "CALL".
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Target Name="Default">
    <Exec Command="CALL batch.cmd" />
  </Target>
</Project>
This will ensure that the outer script runs to completion, and exits CMD returning the appropriate errorlevel value.
(Alternatively, you could upgrade to a suitably recent operating system, and forget everything you've just read. Somewhere between Windows XP and Windows 7, CMD.EXE has acquired the elegance of picking up the latest errorlevel value and using that as its return code, so that even if that last EXIT command isn't executed, it will still pick up the value set by the inner batch script.)
Written on 12-May-2009 by asqui
A day after my post on .NET Event invocation thread safety, a similar question came up on StackOverflow. About a week after, Eric Lippert came out with this blog post that discusses a related problem that is commonly rolled in to the confusion.
Here are my summary learnings:
The main thrust of the StackOverflow question is now:
Why is explicit-null-check the "standard pattern"? The alternative, assigning the empty delegate, requires only = delegate {} to be added to the event declaration, and this eliminates those little piles of stinky ceremony from every place where the event is raised. It would be easy to make sure that the empty delegate is cheap to instantiate. Or am I still missing something?
I still have some reservations about this, and suspect that the likes of Joe Duffy might have something to say about the performance impact of this. Although my performance analysis certainly suggests that the invocation-time overhead of this is, in real terms, negligible, I haven’t done any testing on the initialisation-time overheads. Still, the impact of this is likely to be negligible, and as Earwicker points out “Why let the ugly way be the recommended way? If we wanted premature optimisation instead of clarity, we'd be using assembler”
So, should we all be replacing the copy-and-null-check pattern with the initialise-with-empty-delegate pattern?
Written on 23-Apr-2009 by asqui
In CLR via C#, Richter points out a few subtle points about event invocation in multi-threaded classes:
The Naive Approach: Not thread safe
So consider this invocation code which raises an event:
public static event EventHandler<EventArgs> NonThreadSafeEvent;

public static void OnNonThreadSafeEvent(EventArgs e)
{
    if (NonThreadSafeEvent != null)
    {
        // Event could still become null in this interim,
        // after the check but before the invocation
        NonThreadSafeEvent(null, e);
    }
}
On a class that is being accessed by multiple threads, this could lead to a NullReferenceException, despite the well-intentioned null guard.
There are a number of ways to overcome this problem and guarantee that multi-threaded classes will never be exposed to the risk of a NullReferenceException when attempting to invoke a delegate or event.
The Classic Solution: Take a copy
This is the solution that Richter proposes to achieve thread safety:
public static event EventHandler<EventArgs> ClassicNullCheckedEvent;

public static void OnClassicNullCheckedEvent(EventArgs e)
{
    EventHandler<EventArgs> localCopy = ClassicNullCheckedEvent;
    if (localCopy != null)
    {
        // Nobody can change our local copy so we're sure it's not null
        localCopy(null, e);
    }
}
New-Age Solution: Pre-Initialise
I like to call this the Juval Löwy solution, because he proposes it in one of his books:
“You can ensure that the internal invocation list always has at least one member by initializing it with a do-nothing anonymous method. Because no external party can have a reference to the anonymous method, no external party can remove the method, so the delegate will never be null”
— Programming .NET Components, 2nd Edition, by Juval Löwy
public static event EventHandler<EventArgs> PreInitializedEvent = delegate { };

public static void OnPreInitializedEvent(EventArgs e)
{
    // No check required - event will never be null because
    // we have subscribed an empty anonymous delegate which
    // can never be unsubscribed. (But causes some overhead.)
    PreInitializedEvent(null, e);
}
Of course, he then immediately says that “initializing all delegates this way is impractical”, without explaining why. Seems fine to me! Certainly more practical than remembering to copy-and-check-for-null every time you want to raise an event.
As always, there are some subtle performance implications to each approach (particularly the last one!)
Executing 50000000 iterations . . .
OnNonThreadSafeEvent took: 432ms
OnClassicNullCheckedEvent took: 490ms
OnPreInitializedEvent took: 614ms
Subscribing an empty delegate to each event . . .
Executing 50000000 iterations . . .
OnNonThreadSafeEvent took: 674ms
OnClassicNullCheckedEvent took: 674ms
OnPreInitializedEvent took: 2041ms
Subscribing another empty delegate to each event . . .
Executing 50000000 iterations . . .
OnNonThreadSafeEvent took: 2011ms
OnClassicNullCheckedEvent took: 2061ms
OnPreInitializedEvent took: 2246ms
Done
You probably needn’t worry about these differences until your performance testing turns up a bottleneck on event invocation, i.e. probably never. (Note that the test run is for 50 million iterations.)
(Code samples from this post are available as a VS2008 Solution.)
Written on 19-Apr-2009 by asqui
Last week Bart De Smet posted an excellent explanation of co-variance and contra-variance support in the .NET CLR and the new support for this in C# 4.0 that can make our lives easier.
This is by far the most comprehensive and clear explanation of the rather in-depth field of type variance, and features lots of examples (and diagrams!) to help your understanding.
My favourite part of the article was this awesome fruit farmer metaphor:
"This might go unnoticed if the farmer doesn’t enforce runtime fruit/vegetable type safety."
— Bart De Smet
Read the full article here:
C# 4.0 FEATURE FOCUS – PART 4 – CO- AND CONTRA-VARIANCE FOR GENERIC DELEGATE AND INTERFACE TYPES
Written on 03-Mar-2009 by asqui
Versioning of assemblies in .NET can be a confusing prospect given that there are currently at least three ways to specify a version for your assembly.
Here are the three main version-related assembly attributes:
// Assembly mscorlib, Version 2.0.0.0
[assembly: AssemblyFileVersion("2.0.50727.3521")]
[assembly: AssemblyInformationalVersion("2.0.50727.3521")]
[assembly: AssemblyVersion("2.0.0.0")]
By convention, the four parts of the version are referred to as the Major Version, Minor Version, Build, and Revision.
Typically you’ll manually set the Major and Minor parts of the AssemblyFileVersion to reflect the version of the assembly, then increment the Build and/or Revision every time your build system compiles the assembly. The AssemblyFileVersion should allow you to uniquely identify a build of the assembly, so that you can use it as a starting point for debugging any problems.
On my current project we have the build server encode the changelist number from our source control repository into the Build and Revision parts of the AssemblyFileVersion. This allows us to map directly from an assembly to its source code, for any assembly generated by the build server (without having to use labels or branches in source control, or manually keeping any records of released versions).
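As a rough illustration of the idea, here is a minimal sketch of one way a changelist number could be packed into the Build and Revision parts. The `ChangelistVersion` helper and its split into high and low 16-bit halves are assumptions made for this example; the post doesn't say which encoding the build server actually uses.

```csharp
using System;

// Hypothetical helper: one possible encoding, not necessarily the scheme
// used by the build server described above.
public static class ChangelistVersion
{
    // Each part of a Win32 file version is a 16-bit value.
    private const int PartSize = 1 << 16;

    // Packs the changelist number into the Build (high 16 bits)
    // and Revision (low 16 bits) parts of a four-part version.
    public static Version Encode(int major, int minor, int changelist)
    {
        if (changelist < 0)
        {
            throw new ArgumentOutOfRangeException("changelist");
        }

        int build = changelist / PartSize;
        int revision = changelist % PartSize;
        return new Version(major, minor, build, revision);
    }

    // Recovers the changelist number from a version produced by Encode.
    public static int Decode(Version version)
    {
        return (version.Build * PartSize) + version.Revision;
    }
}
```

With this encoding, changelist 123456 for a 1.2 assembly becomes 1.2.1.57920, and `Decode` maps that back to 123456. A decimal split (say, changelist / 10000 and changelist % 10000) would be easier to read by eye, at the cost of a smaller range.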
This version number is stored in the Win32 version resource and can be seen when viewing the Windows Explorer property pages for the assembly.
The CLR neither cares about nor examines the AssemblyFileVersion.
The AssemblyInformationalVersion is intended to allow coherent versioning of the entire product, which may consist of many assemblies that are independently versioned, perhaps with differing versioning policies, and potentially developed by disparate teams.
“For example, version 2.0 of a product might contain several assemblies; one of these assemblies is marked as version 1.0 since it’s a new assembly that didn’t ship in version 1.0 of the same product. Typically, you set the major and minor parts of this version number to represent the public version of your product. Then you increment the build and revision parts each time you package a complete product with all its assemblies.”
— Jeffrey Richter, CLR via C# (Second Edition) p. 57
The CLR neither cares about nor examines the AssemblyInformationalVersion.
The AssemblyVersion is used by the CLR to bind to strongly named assemblies. It is stored in the AssemblyDef manifest metadata table of the built assembly, and in the AssemblyRef table of any assembly that references it.
This is very important, because it means that when you reference a strongly named assembly, you are tightly bound to a specific AssemblyVersion of that assembly. The entire AssemblyVersion must be an exact match for the binding to succeed. For example, if you reference version 1.0.0.0 of a strongly named assembly at build-time, but only version 1.0.0.1 of that assembly is available at runtime, binding will fail! (You will then have to work around this using Assembly Binding Redirection.)
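For the 1.0.0.0 versus 1.0.0.1 example, a binding redirect in the application configuration file might look something like the sketch below. The assembly name `MyLibrary` and the public key token are placeholders invented for illustration; substitute the identity of the assembly you actually reference.

```xml
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <!-- Hypothetical assembly identity: replace with your own -->
        <assemblyIdentity name="MyLibrary"
                          publicKeyToken="0123456789abcdef"
                          culture="neutral" />
        <!-- Requests for 1.0.0.0 are satisfied by 1.0.0.1 instead -->
        <bindingRedirect oldVersion="1.0.0.0" newVersion="1.0.0.1" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
```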
There is a little confusion around whether the entire AssemblyVersion has to be an exact match in order for an assembly to be loaded. Some people are under the false belief that only the Major and Minor parts of the AssemblyVersion have to match in order for binding to succeed. This is a sensible assumption, but it is ultimately incorrect (as of .NET 3.5), and it’s trivial to verify this for your version of the CLR. Just execute this sample code.
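The original sample isn't reproduced in this text, so what follows is only a minimal sketch of that kind of check, not the author's code. The `AssemblyBindingCheck` class and `TryLoad` method are names made up for this example; `Assembly.Load` and the two exception types are standard framework APIs.

```csharp
using System;
using System.IO;
using System.Reflection;

// Sketch only: a hypothetical stand-in for the linked sample.
public static class AssemblyBindingCheck
{
    // Tries to bind to the assembly with the given full display name
    // and reports the outcome. Returns true if the load succeeded.
    public static bool TryLoad(string displayName)
    {
        Console.WriteLine("Attempting to load assembly: " + displayName);
        try
        {
            Assembly loaded = Assembly.Load(displayName);
            Console.WriteLine("Successfully loaded assembly: " + loaded.FullName);
            return true;
        }
        catch (FileLoadException ex)
        {
            // An assembly was located, but it could not be loaded,
            // e.g. because its identity does not match the reference.
            Console.WriteLine("Assembly binding failed:");
            Console.WriteLine(ex);
            return false;
        }
        catch (FileNotFoundException ex)
        {
            // No assembly with that name could be located at all.
            Console.WriteLine("Assembly not found:");
            Console.WriteLine(ex);
            return false;
        }
    }
}
```

To try it on the .NET Framework CLR discussed here, place a strongly named assembly next to the executable and call `TryLoad` twice: once with its exact display name, and once with only the Revision part of the version changed.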
On my machine the second assembly load fails, and the last two lines of the fusion log make it perfectly clear why:
.NET Framework Version: 2.0.50727.3521
---
Attempting to load assembly: Rhino.Mocks, Version=3.5.0.1337, Culture=neutral, PublicKeyToken=0b3305902db7183f
Successfully loaded assembly: Rhino.Mocks, Version=3.5.0.1337, Culture=neutral, PublicKeyToken=0b3305902db7183f
---
Attempting to load assembly: Rhino.Mocks, Version=3.5.0.1336, Culture=neutral, PublicKeyToken=0b3305902db7183f
Assembly binding failed:
System.IO.FileLoadException: Could not load file or assembly 'Rhino.Mocks, Version=3.5.0.1336, Culture=neutral,
PublicKeyToken=0b3305902db7183f' or one of its dependencies. The located assembly's manifest definition
does not match the assembly reference. (Exception from HRESULT: 0x80131040)
File name: 'Rhino.Mocks, Version=3.5.0.1336, Culture=neutral, PublicKeyToken=0b3305902db7183f'
=== Pre-bind state information ===
LOG: User = Phoenix\Dani
LOG: DisplayName = Rhino.Mocks, Version=3.5.0.1336, Culture=neutral, PublicKeyToken=0b3305902db7183f
(Fully-specified)
LOG: Appbase = [...]
LOG: Initial PrivatePath = NULL
Calling assembly : AssemblyBinding, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null.
===
LOG: This bind starts in default load context.
LOG: No application configuration file found.
LOG: Using machine configuration file from C:\Windows\Microsoft.NET\Framework64\v2.0.50727\config\machine.config.
LOG: Post-policy reference: Rhino.Mocks, Version=3.5.0.1336, Culture=neutral, PublicKeyToken=0b3305902db7183f
LOG: Attempting download of new URL [...].
WRN: Comparing the assembly name resulted in the mismatch: Revision Number
ERR: Failed to complete setup of assembly (hr = 0x80131040). Probing terminated.
I think this confusion probably arises because Microsoft originally intended to be a little more lenient than a strict match on the full AssemblyVersion, matching only on the Major and Minor version parts:
“When loading an assembly, the CLR will automatically find the latest installed servicing version that matches the major/minor version of the assembly being requested.”
— Jeffrey Richter, CLR via C# (Second Edition) p. 56
This was the behaviour in Beta 1 of the 1.0 CLR. However, the feature was removed before the 1.0 release and hasn’t managed to re-surface in .NET 2.0:
“Note: I have just described how you should think of version numbers. Unfortunately, the CLR doesn’t treat version numbers this way. [In .NET 2.0], the CLR treats a version number as an opaque value, and if an assembly depends on version 1.2.3.4 of another assembly, the CLR tries to load version 1.2.3.4 only (unless a binding redirection is in place). However, Microsoft has plans to change the CLR’s loader in a future version so that it loads the latest build/revision for a given major/minor version of an assembly. For example, on a future version of the CLR, if the loader is trying to find version 1.2.3.4 of an assembly and version 1.2.5.0 exists, the loader will automatically pick up the latest servicing version. This will be a very welcome change to the CLR’s loader — I for one can’t wait.”
— Jeffrey Richter, CLR via C# (Second Edition) p. 164 (Emphasis mine)
As this change still hasn’t been implemented, I think it’s safe to assume that Microsoft has back-tracked on this intent, and that it is perhaps too late to change this now. I searched the web to find out what happened to these plans but couldn’t find any answers, and I still wanted to get to the bottom of it.
So I emailed Jeff Richter and asked him directly — I figured if anyone knew what happened, it would be him.
He replied within 12 hours, on a Saturday morning no less, and clarified that the .NET 1.0 Beta 1 loader did implement this ‘automatic roll-forward’ mechanism of picking up the latest available Build and Revision of an assembly, but this behaviour was reverted before .NET 1.0 shipped. It was later intended to revive this but it didn’t make it in before the CLR 2.0 shipped. Then came Silverlight, which took priority for the CLR team, so this functionality got delayed further. In the meantime, most of the people who were around in the days of CLR 1.0 Beta 1 have since moved on, so it’s unlikely that this will see the light of day, despite all the hard work that had already been put into it.
The current behaviour, it seems, is here to stay.
It is also worth noting from my discussion with Jeff that AssemblyFileVersion was only added after the removal of the ‘automatic roll-forward’ mechanism — because after 1.0 Beta 1, any change to the AssemblyVersion was a breaking change for your customers, there was then nowhere to safely store your build number. AssemblyFileVersion is that safe haven, since it’s never automatically examined by the CLR. Maybe it’s clearer that way, having two separate version numbers, with separate meanings, rather than trying to make that separation between the Major/Minor (breaking) and the Build/Revision (non-breaking) parts of the AssemblyVersion.
The moral is that if you’re shipping assemblies that other developers are going to be referencing, you need to be extremely careful about when you do (and don’t) change the AssemblyVersion of those assemblies. Any changes to the AssemblyVersion will mean that application developers will either have to re-compile against the new version (to update those AssemblyRef entries) or use assembly binding redirects to manually override the binding.
Just take another look at the version attributes on mscorlib:
// Assembly mscorlib, Version 2.0.0.0
[assembly: AssemblyFileVersion("2.0.50727.3521")]
[assembly: AssemblyInformationalVersion("2.0.50727.3521")]
[assembly: AssemblyVersion("2.0.0.0")]
Note that it’s the AssemblyFileVersion that contains all the interesting servicing information (it’s the Revision part of this version that tells you what Service Pack you’re on). Meanwhile, the AssemblyVersion is fixed at a boring old 2.0.0.0. Any change to the AssemblyVersion would force every .NET application referencing mscorlib.dll to re-compile against the new version!
Written on 14-Feb-2009 by asqui
Written on 10-Feb-2009 by asqui
If you define your own .NET Exception type (derived from System.Exception) it is important to remember that this custom exception type will not be serializable by default.
It is often necessary for Exceptions to be serializable because without this they will not travel across remoting boundaries, or even between application domains within the same process.
It is the [Serializable] attribute that indicates if your type is serializable. Although System.Exception is decorated with the [Serializable] attribute, this attribute will not be inherited by your custom exception class. The reason for this is clear when we open up Reflector and look at the declaration of SerializableAttribute:
[ComVisible(true)]
[AttributeUsage(
    AttributeTargets.Delegate | AttributeTargets.Enum |
    AttributeTargets.Struct | AttributeTargets.Class,
    Inherited=false)] // Not inherited!
public sealed class SerializableAttribute : Attribute
{ ... }
This means that any custom exception types you create must be explicitly decorated with the [Serializable] attribute.
The default serialization behaviour (which you get for free when you just decorate your class with [Serializable]) is to serialize all public and private fields in the type. This is good enough for many cases, but System.Exception needs to do some special things when being serialized and de-serialized. (For example, populate the lazily instantiated stack trace, which may not have been generated if the default serializer tries to just reach in and grab the field values through reflection.)
Because System.Exception implements the ISerializable interface, by inheritance, your derived exception class also necessarily implements this interface. This means you cannot rely on any default serialization behaviour.
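You have to play along with the full ISerializable pattern. This involves two things: providing a constructor that takes (SerializationInfo, StreamingContext) to restore your additional state on de-serialization, and overriding GetObjectData() to save that state on serialization, calling through to the base class in both cases.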
Here is an example custom exception which defines additional properties and serializes them correctly.
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;
using System.Security.Permissions;

[Serializable]
// Attribute is NOT inherited from Exception and MUST be specified.
public class SerializableException : Exception
{
    private readonly string resourceName;
    private readonly IList<string> validationErrors;

    public SerializableException()
    {
    }

    public SerializableException(string message)
        : base(message)
    {
    }

    public SerializableException(string message, Exception inner)
        : base(message, inner)
    {
    }

    public SerializableException(string message, string resourceName, IList<string> validationErrors)
        : base(message)
    {
        this.resourceName = resourceName;
        this.validationErrors = validationErrors;
    }

    public SerializableException(string message, string resourceName, IList<string> validationErrors, Exception inner)
        : base(message, inner)
    {
        this.resourceName = resourceName;
        this.validationErrors = validationErrors;
    }

    [SecurityPermissionAttribute(SecurityAction.Demand, SerializationFormatter = true)]
    // Protected for unsealed classes, private for sealed.
    protected SerializableException(SerializationInfo info, StreamingContext context)
        : base(info, context)
    {
        this.resourceName = info.GetString("ResourceName");
        this.validationErrors = (IList<string>)info.GetValue("ValidationErrors", typeof(IList<string>));
    }

    public string ResourceName
    {
        get { return this.resourceName; }
    }

    public IList<string> ValidationErrors
    {
        get { return this.validationErrors; }
    }

    [SecurityPermissionAttribute(SecurityAction.Demand, SerializationFormatter = true)]
    public override void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        if (info == null)
        {
            throw new ArgumentNullException("info");
        }

        info.AddValue("ResourceName", this.ResourceName);

        // Note: if "List<T>" isn't serializable you may need to work out another
        // method of adding your list, this is just for show...
        info.AddValue("ValidationErrors", this.ValidationErrors, typeof(IList<string>));

        // MUST call through to the base class to let it save its own state
        base.GetObjectData(info, context);
    }
}
You can also download a VS2008 project with a more complete set of examples, including unit tests.
(This post was based on my Stack Overflow question, “What is the correct way to make a custom .NET Exception serializable?”)
Written on 03-Feb-2009 by asqui
Written on 22-Jan-2009 by asqui
I recently came across a curious behaviour in the Visual Studio Code Analysis feature (formerly known as FxCop).
The description for rule CA1001 is "Types that own disposable fields should be disposable". This is intended to highlight when you've forgotten to implement IDisposable on a class that owns other disposable types.
For example, consider a LogWriter class which internally uses a FileStream to keep track of the file it is writing to. The user of this LogWriter should be able to deterministically close the file that is being written to. The standard way to do this is to implement IDisposable on LogWriter to dispose of the FileStream. If you don't do this, the user of the LogWriter may be at the mercy of garbage collection, and the FileStream Finalizer, for closure of the log file. This is a Bad Thing because it could (for example) unnecessarily prevent that file from being opened by someone else.
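To make the scenario concrete, here is a minimal sketch of what such a LogWriter might look like. The class body, the `WriteLine` method and the choice of `FileShare.None` are assumptions made for illustration; only the idea of a LogWriter that owns a FileStream comes from the description above.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical LogWriter, sketched to illustrate the scenario above.
public sealed class LogWriter : IDisposable
{
    // Disposable field owned by this class: this is what CA1001 looks for.
    private readonly FileStream stream;

    public LogWriter(string path)
    {
        // FileShare.None: nobody else can open the file while we hold it.
        this.stream = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.None);
    }

    public void WriteLine(string message)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(message + Environment.NewLine);
        this.stream.Write(bytes, 0, bytes.Length);
    }

    public void Dispose()
    {
        // Closes the file deterministically, rather than waiting for
        // the FileStream finalizer to run.
        this.stream.Dispose();
    }
}
```

A caller would wrap the writer in a using block, so that the file is released as soon as the block exits and can then be opened by someone else. Because the class is sealed and holds only a managed resource, a plain Dispose() is enough here; no finalizer is needed.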
What's odd is that this rule was triggering in a rather unexpected scenario:
using System;

public sealed class CA1001Repro
{
    private object notUsed;

    public static void Main()
    {
        using (DisposableClass disposableClass = new DisposableClass())
        {
            // The lambda captures disposableClass, so the compiler
            // generates a closure class to hold it.
            HigherOrder(a => disposableClass.DoSomething());
        }
    }

    private static void HigherOrder(Action<string> a)
    {
        throw new NotImplementedException();
    }
}

internal class DisposableClass : IDisposable
{
    public void Dispose()
    {
        throw new NotImplementedException();
    }

    public void DoSomething()
    {
        throw new NotImplementedException();
    }
}
With code analysis rule CA1001 enabled, this generates the following build output:
CA1001 : Microsoft.Design : Implement IDisposable on 'CA1001Repro' because it creates members of the following IDisposable types: 'DisposableClass'.
Odd, isn’t it? It’s telling me quite clearly that the class CA1001Repro creates a member of the type DisposableClass but nothing of this sort is true! The only place I create a DisposableClass instance is within the using block in Main(). The only field member of the class is an object called notUsed.
I presume this error is related to the closure class, which is generated by the syntactic sugar of C# anonymous methods, and allows my inline delegate to seamlessly gain access to the disposableClass variable which is in scope.
With Reflector we can see this generated class, and sure enough it does have a member of type DisposableClass, and it is also not IDisposable:
[CompilerGenerated]
private sealed class <>c__DisplayClass2
{
    // Fields
    public DisposableClass disposableClass;

    // Methods
    public <>c__DisplayClass2();
    public void <Main>b__0(string a);
}
However, if this were the root of the problem then why did CA1001 explicitly name ‘CA1001Repro’ as the class at fault? Why didn’t it name the true culprit, ‘CA1001Repro.<>c__DisplayClass2’?
And as a final twist of weirdness, if you remove (or comment out) the unused field ‘notUsed’, this problem goes away entirely!
I’ve seen this problem both with Visual Studio 2005 and 2008, and I’ve not been able to find it mentioned anywhere as a known bug with Code Analysis.
Download the repro project here and see if you get the same results. Let me know if you can work out what is going on.