Archive for the ‘C#’ Category

Testing Code with the Mono Csharp REPL

March 21, 2012 2 comments

I’m porting a full enterprise application to Mono so it can be deployed on Linux as requested by customers. I’m using the term “porting” loosely, because really, I just need to verify the existing code will run under Mono and make some minor adjustments so everything runs cross-platform. The Mono csharp read-evaluate-print-loop (REPL) has been an invaluable tool in this process.

Take Active Directory authentication for example. This is one of the areas where the Mono framework is not compatible with the .NET framework. You can see this right away in the csharp REPL, and then work out a quick alternative with the lower-level DirectoryServices libraries. Just type ‘csharp’ at the terminal on a Linux system with Mono installed:

csharp> LoadAssembly("System.DirectoryServices.AccountManagement");
error CS0006: Metadata file `System.DirectoryServices.AccountManagement' could not be found

No worries – the AccountManagement APIs aren’t there on Linux, and it’s good to be able to find this out right away (you can also check the Mono GAC, but this is REPL school). Now we just need to verify that a lower-level call will work:

csharp> LoadAssembly("System.DirectoryServices");
csharp> using System.DirectoryServices;
csharp> DirectoryEntry adsEntry = new DirectoryEntry("LDAP://10.0.1.22/dc=mydomain,dc=local", "mydomain\\testUser", "testPa55w0rd");
csharp> var searcher = new DirectorySearcher(adsEntry);
csharp> var res = searcher.FindOne();
csharp> res.Properties;

It’s great to be able to do some quick API checks to understand and isolate compatibility issues right away. Go Mono REPL!
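Once the REPL has shown which API is missing, the application still has to pick the right code path at run time. Here is a minimal sketch of one way to do that, assuming the common check for the internal Mono.Runtime type. The RuntimeProbe class and the idea of returning the assembly name as a string are hypothetical, purely for illustration, and not taken from the application being ported:

```csharp
using System;

public static class RuntimeProbe {
    // Mono exposes an internal Mono.Runtime type; on Microsoft's .NET it is absent.
    public static bool IsRunningOnMono() {
        return Type.GetType("Mono.Runtime") != null;
    }

    // Hypothetical helper: names the directory API the caller should use.
    public static string ChooseDirectoryApi() {
        return IsRunningOnMono()
            ? "System.DirectoryServices"
            : "System.DirectoryServices.AccountManagement";
    }

    public static void Main() {
        Console.WriteLine("Running on Mono: " + IsRunningOnMono());
        Console.WriteLine("Directory API to use: " + ChooseDirectoryApi());
    }
}
```

In a real application the branch would select an authentication implementation rather than a string, but the runtime check itself stays the same.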

Categories: C#

Loading XML into MongoDB

December 29, 2011 Leave a comment

I’m starting a new app today and building out the data layer with MongoDB as my database. The app uses a data set from the USDA that I thought would make a good sample for getting started with the “Load” portion of ETL into MongoDB.

The data is available from the USDA here – the raw XML for MyPyramid: http://explore.data.gov/download/b978-7txq/XML

Step 1 – Define a Class for the data

This step isn’t strictly necessary, since you could build a raw BSON document directly from the XML, but you would miss out on some of the C# driver’s niceties. Looking at the raw data, I came up with this class, along with a constructor that takes an XElement to handle the parsing. Strict DTO people might move that parsing to a function within the ETL process…up to you. The only MongoDB-specific code here is the BsonId attribute, which I’ll put on the FoodCode property – a unique ID from the source system.

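// Namespaces needed by this class.
using System;
using System.Xml.Linq;
using MongoDB.Bson.Serialization.Attributes;

public class Food {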
	[BsonId]
	public int FoodCode {get;set;}
	public string DisplayName {get;set;}
	public float PortionDefault {get;set;}
	public float PortionAmount {get;set;}
	public string PortionDisplayName {get;set;}
	public float Factor {get;set;}
	public float Increment {get;set;}
	public float Multiplier {get;set;}
	public float Grains {get;set;}
	public float WholeGrains {get;set;}
	public float Vegetables {get;set;}
	public float OrangeVegetables {get;set;}
	public float DarkGreenVegetables {get;set;}
	public float StarchyVegetables {get;set;}
	public float OtherVegetables {get;set;}
	public float Fruits {get;set;}
	public float Milk {get;set;}
	public float Meats {get;set;}
	public float Soy {get;set;}
	public float DryBeansPeas {get;set;}
	public float Oils {get;set;}
	public float SolidFats {get;set;}
	public float AddedSugars {get;set;}
	public float Alcohol {get;set;}
	public float Calories {get;set;}
	public float SaturatedFats {get;set;}
	
	public Food(XElement elem) {
		this.FoodCode = Int32.Parse(elem.Element("Food_Code").Value);
		this.DisplayName = elem.Element("Display_Name").Value;
		this.PortionDefault = float.Parse (elem.Element("Portion_Default").Value);
		this.PortionAmount = float.Parse (elem.Element("Portion_Amount").Value);
		this.PortionDisplayName = elem.Element("Portion_Display_Name").Value;
		if(elem.Element ("Factor") != null)
			this.Factor = float.Parse (elem.Element("Factor").Value);
		this.Increment = float.Parse (elem.Element("Increment").Value);
		this.Multiplier = float.Parse (elem.Element("Multiplier").Value);
		this.Grains = float.Parse (elem.Element("Grains").Value);
		this.WholeGrains = float.Parse (elem.Element("Whole_Grains").Value);
		this.Vegetables = float.Parse (elem.Element("Vegetables").Value);
		this.OrangeVegetables = float.Parse (elem.Element("Orange_Vegetables").Value);
		this.DarkGreenVegetables = float.Parse (elem.Element("Drkgreen_Vegetables").Value);
		this.StarchyVegetables = float.Parse (elem.Element("Starchy_vegetables").Value);
		this.OtherVegetables = float.Parse (elem.Element("Other_Vegetables").Value);
		this.Fruits = float.Parse (elem.Element("Fruits").Value);
		this.Milk = float.Parse (elem.Element("Milk").Value);
		this.Meats = float.Parse (elem.Element("Meats").Value);
		this.Soy = float.Parse (elem.Element("Soy").Value);
		this.DryBeansPeas = float.Parse (elem.Element("Drybeans_Peas").Value);
		this.Oils = float.Parse (elem.Element("Oils").Value);
		this.SolidFats = float.Parse (elem.Element("Solid_Fats").Value);
		this.AddedSugars = float.Parse (elem.Element("Added_Sugars").Value);
		this.Alcohol = float.Parse (elem.Element("Alcohol").Value);
		this.Calories = float.Parse (elem.Element("Calories").Value);
		this.SaturatedFats = float.Parse (elem.Element("Saturated_Fats").Value);
	}
}

You might notice I’m using float for my decimal values. That’s all the accuracy I need, but it does lose some precision. I’m rounding the data when I use it so it won’t really matter, but if your needs differ, choose a different numeric type.
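To make the precision trade-off concrete, here is a small sketch that parses the same text as a float and as a decimal. The PrecisionDemo class and the sample value "2.35" are made up for illustration. It also passes the invariant culture to the parser, on the assumption that the source file always uses a dot as the decimal separator:

```csharp
using System;
using System.Globalization;

public static class PrecisionDemo {
    // Parse with the invariant culture so "2.35" means the same under every locale.
    public static float AsFloat(string s) {
        return float.Parse(s, CultureInfo.InvariantCulture);
    }

    public static decimal AsDecimal(string s) {
        return decimal.Parse(s, CultureInfo.InvariantCulture);
    }

    public static void Main() {
        string raw = "2.35"; // made-up sample value

        // Widening the float to double exposes the binary rounding error.
        Console.WriteLine((double)AsFloat(raw));

        // decimal keeps the base-10 digits exactly as written.
        Console.WriteLine(AsDecimal(raw));
    }
}
```

If the rounding error matters for your data, decimal or double are the usual alternatives.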

Step 2 – Function for reading the XML file

This is a pretty small data file, only about 750 records, but loading it all into memory at once is a waste. I want to load the “Food_Display_Row” XML elements one at a time, convert to a Food object, store in MongoDB, and move on to the next. It’s a job for a streaming API and an iterator, powered by “yield return” to get one XElement at a time:

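// Needs: using System.Collections.Generic; using System.Xml; using System.Xml.Linq;
static IEnumerable<XElement> readElementStream(string fileName, string elementName) {
	using(var reader = XmlReader.Create(fileName)) {
		reader.MoveToContent();
		while(!reader.EOF) {
			if(reader.NodeType == XmlNodeType.Element && reader.Name == elementName) {
				// ReadFrom consumes the element and leaves the reader on the next node,
				// so calling Read() here would skip an element that follows immediately.
				var e = XElement.ReadFrom (reader) as XElement;
				yield return e;
			}
			else {
				reader.Read();
			}
		}
		// The using block closes the reader.
	}
}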
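To try the streaming idea without a file on disk, here is a self-contained sketch that reads from a TextReader instead of a file name. The XmlStreaming class, the ReadElementStream and CountRows names, and the three-row sample document are all hypothetical. The loop advances the reader itself only when no element was consumed, which keeps back-to-back rows from being skipped:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class XmlStreaming {
    // Hypothetical variant that takes a TextReader so it can run on an in-memory string.
    public static IEnumerable<XElement> ReadElementStream(TextReader input, string elementName) {
        using (var reader = XmlReader.Create(input)) {
            reader.MoveToContent();
            while (!reader.EOF) {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName) {
                    // ReadFrom consumes the element and leaves the reader on the
                    // node that follows it, so no extra Read() is needed here.
                    yield return (XElement)XNode.ReadFrom(reader);
                } else {
                    reader.Read();
                }
            }
        }
    }

    // Counts the matching elements while holding only one of them in memory at a time.
    public static int CountRows(string xml, string elementName) {
        int count = 0;
        foreach (var row in ReadElementStream(new StringReader(xml), elementName)) {
            count++;
        }
        return count;
    }

    public static void Main() {
        // Made-up sample: three rows with no whitespace between them.
        string xml = "<Food_Display_Table>" +
            "<Food_Display_Row><Food_Code>1</Food_Code></Food_Display_Row>" +
            "<Food_Display_Row><Food_Code>2</Food_Code></Food_Display_Row>" +
            "<Food_Display_Row><Food_Code>3</Food_Code></Food_Display_Row>" +
            "</Food_Display_Table>";
        Console.WriteLine(CountRows(xml, "Food_Display_Row")); // prints 3
    }
}
```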

Step 3 – Pull it all together and load the data

With the pieces in place, the load process is pretty simple. Connect to the server, get the database (MongoDB creates it on first use), get the collection (MongoDB also creates the collection), and use the iterator to read the XML file, load each element into a Food object and insert into the MongoDB collection. At the end, we have a MongoDB database with a collection of data from the food guide pyramid.

var server = MongoServer.Create ("mongodb://localhost");
var db = server.GetDatabase("gov");
var foods = db.GetCollection<Food>("food");

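// "~" is a shell shortcut that .NET and Mono do not expand, so build the path
// from the home directory instead (needs using System; and using System.IO;).
var xmlPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Personal), "Downloads/MyFoodapediaData/Food_Display_Table.xml");

foreach(var elem in readElementStream(xmlPath, "Food_Display_Row")) {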
	var food = new Food(elem);
	foods.Insert (food);
}
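A quick way to check the result is to count the documents and print one of them. This is only a sketch, assuming the same legacy driver API and the same "gov" database and "food" collection used in the load step. The VerifyLoad class is hypothetical, and the collection is read as raw BsonDocument so the check does not depend on the Food class:

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

public static class VerifyLoad {
    public static void Main() {
        // Same connection, database, and collection names as the load step.
        var server = MongoServer.Create("mongodb://localhost");
        var db = server.GetDatabase("gov");
        var foods = db.GetCollection<BsonDocument>("food");

        // Number of documents currently in the collection.
        Console.WriteLine("Documents loaded: " + foods.Count());

        // Print one stored document as JSON to eyeball the field names.
        var sample = foods.FindOne();
        if (sample != null) {
            Console.WriteLine(sample.ToJson());
        }
    }
}
```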

My favorite thing about this is that I never had to leave C# to create the database, parse the source XML, or load the data. I don’t have to run a separate ETL process or use management tools to configure my database schema. It’s a simple, self-contained solution.

My second favorite part is that I ran all of this under Ubuntu and Mono. It should work just as well under Windows and .NET, but life is better running under an open source software stack.

I hope you find this helpful if you’re getting started with MongoDB and want a little data to play with.

Categories: C#, MongoDB

If C# is so awesome, why use anything else?

December 29, 2011 20 comments

Anyone who knows me professionally knows I work in C# most of the time. I think it’s a great language that’s been well designed and made very portable by way of being an open language specification. A lot of people look at C# and say that it’s just Java with some Microsoft extensions. Sort of: its framework (.NET) ships with quite a few libraries that interoperate well with Windows, but the C# language itself doesn’t have anything to do with Windows, and it runs on Linux, OS X, Android, iOS, and so on. In my opinion, Java has stagnated over the years, while C# has been evolving with generics (which Java followed), lambdas (Java finally gets them years later), anonymous types, partial method and class declarations, language integrated query (LINQ), dynamic runtime integration, and soon a simplified asynchronous programming model with await and async that will allow the runtime to deal with the gory details of async programming rather than forcing the programmer to understand and properly implement callbacks and cleanup. Java isn’t catching up fast, so Scala is filling the gaps, but C# remains years ahead.

Every year, my family looks at me funny when they give me new books. That’s right, I’m a geek who reads computer books. Most of these books are not on C#, but on JavaScript, Python, and Haskell, and I even keep an old Perl book on my shelf. What is all this other stuff?

JavaScript – it’s pretty rare these days that other developers would say, “why would you ever want to write JavaScript?” It’s a ubiquitous language amongst web browsers, and it’s pretty rare that anyone can write much of a browser-based application at all without it. Besides, the latest trend is to write a “language X to JavaScript converter” and what good would that be if I didn’t know JavaScript and wasn’t willing to learn language X? There have always been some nice server-side implementations, like SpiderMonkey and the newer V8, powering trendy applications like MongoDB and Node.js. Until I started down the Python path, whenever I needed extensibility, I would embed SpiderMonkey for some JavaScript fun.

Python – in the realm of C#, a lot of people are uncomfortable mixing in Python. They don’t like the idea of losing compile-time checks and worry about needing a myriad of Python frameworks to solve any sizable development task. However, Python is an excellent tool for large and small projects alike, and IronPython takes the Python language and gives it access to the full .NET framework. In the last few years, I’ve felt constrained if I didn’t have the layer of extensibility that IronPython can add to CLR applications. Python scripts let you treat code as data, meaning you can store it, transmit it, and change it at runtime. Python gives you a new way to move the problem around and solve it at a different time in your overall solution. It’s a great piece of the toolbox.

I remember spending weeks building business rules engines so non-programmers could add some logic to enterprise applications. These engines would use reflection and Lightweight Code Generation (LCG) and a clumsy UI where end users would select data objects and operators and build expression statements. IronPython uses LCG, is highly optimized, and gives you a general purpose scripting language with access to CLR objects. Most end users prefer the ability to write an expression in script rather than fumble with the type of UI needed to build an expression tree. This is just scratching the surface of the Python language, but at the very least, it’s a great tool anywhere you want to offer runtime extensibility.

Haskell is pure functional programming – no state, just functions. I used F# a bit for professional work just to learn it, but it allows you to fall back into the OOP line of thinking. Haskell makes you take a fully functional approach. I recommend every developer that’s looking to expand their approach to problem solving to spend some time with it.

What about that Perl 5 book? Well, I don’t use that, to be honest. I did once upon a time, but I really do avoid Perl at all costs. Maybe one day, I’ll pick it back up.

Categories: C#, IronPython

Calling IronPython from C#

September 16, 2011 2 comments

There are a lot of great Python libraries out there, and IronPython makes it really easy to call many of them from .NET. Over the years, IronPython has become easier to embed in your applications, and the DLR that was added in .NET 4 makes it dead simple.

Say you have a Python expression (could be an entire module) in a string variable called “expression” – the code to execute that is this simple:

var engine = IronPython.Hosting.Python.CreateEngine();
var script = engine.CreateScriptSourceFromString(expression);
var scope = engine.CreateScope();
dynamic result = script.Execute(scope);

When you execute that, your result will be whatever you returned from the Python expression. You could return a value or a function defined in Python. If your expression is a Python lambda taking three parameters, from your C# code you can write the following:

dynamic foo = result(a,b,c);
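As a concrete illustration, here is a minimal end-to-end sketch. The LambdaDemo class, the lambda body, and the argument values are made up, and the project is assumed to reference the IronPython and Microsoft.CSharp assemblies:

```csharp
using System;
using IronPython.Hosting;

public static class LambdaDemo {
    public static void Main() {
        // Made-up expression: a Python lambda taking three parameters.
        string expression = "lambda a, b, c: a + b * c";

        var engine = Python.CreateEngine();
        var script = engine.CreateScriptSourceFromString(expression);
        var scope = engine.CreateScope();

        // Executing the expression yields the Python function object itself.
        dynamic result = script.Execute(scope);

        // Invoke it from C# through the DLR, as if it were a delegate.
        dynamic foo = result(1, 2, 3);
        Console.WriteLine(foo);
    }
}
```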

If you need to load additional .NET assemblies to expose them to the IronPython code, just call the following:

engine.Runtime.LoadAssembly(assembly);

What about passing parameters? The scope lets you pass in a dictionary of parameters. The key of each dictionary entry is the name the parameter will have inside the IronPython scope, and the value is the value that parameter will have when script.Execute(scope) is called. To pass a dictionary of parameters, simply do this:

var parameters = new Dictionary<string,object>() {
   { "age", 30 }, { "name", "Vinny" }
};
scope = engine.CreateScope(parameters);
result = script.Execute(scope);

The parameters “age” and “name” will be passed into the scope of the IronPython script being executed.
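Here is a small sketch of how those parameters can be used from the Python side. The ScopeDemo class and the Python expression are made up for illustration, and the project is again assumed to reference the IronPython and Microsoft.CSharp assemblies:

```csharp
using System;
using System.Collections.Generic;
using IronPython.Hosting;

public static class ScopeDemo {
    public static void Main() {
        var engine = Python.CreateEngine();
        var parameters = new Dictionary<string, object>() {
            { "age", 30 }, { "name", "Vinny" }
        };
        var scope = engine.CreateScope(parameters);

        // Made-up expression that reads both variables from the scope.
        var script = engine.CreateScriptSourceFromString("name + ' is ' + str(age)");
        dynamic result = script.Execute(scope);
        Console.WriteLine(result);

        // Variables can also be read back from the scope by name.
        Console.WriteLine(scope.GetVariable<int>("age"));
    }
}
```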

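Suppose you have additional Python modules that you want to call from your embedded IronPython. IronPython ships with quite a bit of the standard library, but your embedded code doesn’t necessarily know how to find it. A call to engine.SetSearchPaths(paths) sets the collection of paths that IronPython should search when executing your code. It replaces the current list, so start from engine.GetSearchPaths() if you want to keep the existing entries.

// Start from the current search paths so the existing entries are kept.
var paths = engine.GetSearchPaths();
// Verbatim string: "\p" and "\m" are not valid escape sequences in a regular C# string.
paths.Add(@"c:\path\to\my\modules");
engine.SetSearchPaths(paths);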

I encourage you to explore the options for embedding IronPython in your own applications. The ScriptEngine is quite robust; you can execute string expressions or entire files, in the same AppDomain or in a new one.

Categories: C#, DLR, IronPython

TCP Proxy in C# using Task Parallel Library

April 27, 2011 9 comments

Every now and then I have the need to proxy TCP communications, handy for things like viewing network traffic or proxying Silverlight or Flash requests. C# makes this pretty easy, and the Task Parallel Library (add-on to .NET 3.5 & shipped with .NET 4) simplifies the code with a nice fluent interface.

Here’s a quick example that works for proxying a VNC connection. There is one task for reading from the client and sending data to the server and another task for reading server responses and sending them to the client.

static TcpListener listener = new TcpListener(IPAddress.Any, 4502);

const int BUFFER_SIZE = 4096;

static void Main(string[] args) {
    listener.Start();
    new Task(() => {
        // Accept clients.
        while (true) {
            var client = listener.AcceptTcpClient();
            new Task(() => {
                // Handle this client.
                var clientStream = client.GetStream();
                TcpClient server = new TcpClient("10.0.1.5", 5900);
                var serverStream = server.GetStream();
                new Task(() => {
                    byte[] message = new byte[BUFFER_SIZE];
                    int clientBytes;
                    while (true) {
                        try {
                            clientBytes = clientStream.Read(message, 0, BUFFER_SIZE);
                        }
                        catch {
                            // Socket error - exit loop.  Client will have to reconnect.
                            break;
                        }
                        if (clientBytes == 0) {
                            // Client disconnected.
                            break;
                        }
                        serverStream.Write(message, 0, clientBytes);
                    }
                    client.Close();
                }).Start();
                new Task(() => {
                    byte[] message = new byte[BUFFER_SIZE];
                    int serverBytes;
                    while (true) {
                        try {
                            serverBytes = serverStream.Read(message, 0, BUFFER_SIZE);
                            clientStream.Write(message, 0, serverBytes);
                        }
                        catch {
                            // Server socket error - exit loop.  Client will have to reconnect.
                            break;
                        }
                        if (serverBytes == 0) {
                            // server disconnected.
                            break;
                        }
                    }
                }).Start();
            }).Start();
        }
    }).Start();
    Console.WriteLine("Server listening on port 4502.  Press enter to exit.");
    Console.ReadLine();
    listener.Stop();
}

This is for illustrative purposes only. If you decide to use this in production, you’ll need to use TcpListener.BeginAcceptTcpClient() for async connections, you’ll need error handling and logging, and you’ll want some sort of pool to manage (and clean up) client socket connections. Have fun, and let me know if you have concerns or suggestions.
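For comparison, here is a rough sketch of a non-blocking version. It swaps BeginAcceptTcpClient for the task-based AcceptTcpClientAsync with async and await, so it assumes .NET 4.5 or later (or a Mono release with the same APIs). The AsyncProxy class and its method names are hypothetical, and it is still not production code: there is no logging beyond printing faults and no connection pooling.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class AsyncProxy {
    // Copy everything from one stream to the other until the source ends.
    public static async Task PumpAsync(Stream source, Stream destination) {
        await source.CopyToAsync(destination);
    }

    // Relay one client connection to the upstream server in both directions.
    public static async Task HandleClientAsync(TcpClient client, string host, int port) {
        using (client)
        using (var server = new TcpClient()) {
            await server.ConnectAsync(host, port);
            var clientStream = client.GetStream();
            var serverStream = server.GetStream();
            // When either direction finishes, the using blocks close both sockets.
            await Task.WhenAny(
                PumpAsync(clientStream, serverStream),
                PumpAsync(serverStream, clientStream));
        }
    }

    // Accept clients in a loop; each client is handled without blocking the loop.
    public static async Task RunAsync(TcpListener listener, string host, int port) {
        listener.Start();
        while (true) {
            var client = await listener.AcceptTcpClientAsync();
            var ignored = HandleClientAsync(client, host, port)
                .ContinueWith(t => Console.Error.WriteLine(t.Exception),
                              TaskContinuationOptions.OnlyOnFaulted);
        }
    }

    public static void Main() {
        // Same addresses and ports as the example above.
        var listener = new TcpListener(IPAddress.Any, 4502);
        var proxy = RunAsync(listener, "10.0.1.5", 5900);
        Console.WriteLine("Proxy listening on port 4502.  Press enter to exit.");
        Console.ReadLine();
        // Stopping the listener makes the pending accept fail, which ends RunAsync.
        listener.Stop();
    }
}
```

The design choice here is that no thread sits blocked in Read or AcceptTcpClient; the runtime resumes each method when data or a connection arrives.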

Categories: C#, Task, tcp, TPL