Machine Learning on the .NET Framework with ML.NET

As soon as I became aware of the ML.NET framework back in August 2018, I headed over to its MSDN tutorial and GitHub pages to try out one of the sample applications and get the hang of using it. Note, however, that Microsoft was still actively developing ML.NET at the time, so the APIs were bound to change. Now that most of the APIs are stable, many of the samples can be accessed here.

My First Application (Taxi Fare Prediction)

With some inspiration from ride-hailing services like Uber or PickMe (a Sri Lankan ride-hailing service), I thought of working on a sample taxi fare predictor. The predictor takes inputs such as rate code, passenger count, trip time (in seconds) and trip distance to predict the overall cost. Apps like Uber, by contrast, also use traffic data to predict the trip time and hence the cost.

The Taxi Fare Dataset

The input dataset was based on the historical taxi fares of the New York taxi service and consisted of: rate code, passenger count, trip time (in seconds), trip distance, payment type (cash or card) and the respective fare in USD.

The first step in the process was to analyze the data to find out how it could be decoded. Luckily, most of this dataset could be decoded as floating-point values (at the time, that was the only numeric type that worked with ML.NET), whilst the payment type was processed as a string.
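To make the decoding step concrete, here is a minimal sketch of reading such rows into typed records. ML.NET itself is a C# library, so this is not its API; it is a plain Python illustration of the idea. The column names and the two sample rows are hypothetical and made up for the example, not taken from the real dataset.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class TaxiTrip:
    # Every numeric column is decoded as a float, mirroring the post.
    rate_code: float
    passenger_count: float
    trip_time_secs: float
    trip_distance: float
    payment_type: str  # kept as text, e.g. "Cash" or "Card"
    fare_amount: float


def parse_trips(text):
    """Parse CSV text with a header row into a list of TaxiTrip records."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        TaxiTrip(
            rate_code=float(row["rate_code"]),
            passenger_count=float(row["passenger_count"]),
            trip_time_secs=float(row["trip_time_secs"]),
            trip_distance=float(row["trip_distance"]),
            payment_type=row["payment_type"].strip(),
            fare_amount=float(row["fare_amount"]),
        )
        for row in reader
    ]


# Hypothetical sample data (column names and values are assumptions).
SAMPLE = """rate_code,passenger_count,trip_time_secs,trip_distance,payment_type,fare_amount
1,1,1271,3.8,Card,17.5
1,2,474,1.5,Cash,8.0
"""

trips = parse_trips(SAMPLE)
```

Even the passenger count, which is naturally an integer, ends up as a float here, which matches the constraint described above.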

The next step was to pre-process the training data so that it could be fed into the training algorithm for model creation. Here it was important to distinguish between labels and features: the label is the value that one wants to predict, whilst the features are the input data used to make the prediction. So, in this case, the label was the taxi fare in USD, and the features were the rest of the input data.
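The label/feature distinction can be sketched in a few lines. Again, this is a Python illustration rather than ML.NET code, and the column names (including `fare_amount` as the label) are assumptions made for the example.

```python
# Name of the column to predict (hypothetical column name).
LABEL_COLUMN = "fare_amount"


def split_label_and_features(row):
    """Return (features, label) for one record given as a dict."""
    features = {name: value for name, value in row.items() if name != LABEL_COLUMN}
    return features, row[LABEL_COLUMN]


# One made-up record, used only to show the split.
row = {
    "rate_code": 1.0,
    "passenger_count": 1.0,
    "trip_time_secs": 1271.0,
    "trip_distance": 3.8,
    "payment_type": "Card",
    "fare_amount": 17.5,
}

features, label = split_label_and_features(row)
```

Everything except the fare ends up in `features`, and the fare alone becomes the `label` that the model learns to predict.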

The features have to be transformed into vectors according to their respective data types. There are also transformations that let the coder normalize and scale the feature data. More information regarding data transformations can be found here. Once they have been transformed, the features are concatenated into a single vector. The features, along with their respective labels, are then fed into the training algorithm; in this case it was a FastTree regressor. Once the model has been created, it can be used to calculate fare predictions.
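The transform-then-concatenate idea can be sketched as follows. This is an illustration in plain Python, not the ML.NET pipeline: it assumes one-hot encoding for the payment type and min-max scaling for the numeric columns, which are common choices but not necessarily the ones used in the original application. Column names and sample values are hypothetical, and the sketch stops short of training a model.

```python
# Numeric feature columns (hypothetical names).
NUMERIC = ["rate_code", "passenger_count", "trip_time_secs", "trip_distance"]


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over the known categories."""
    return [1.0 if value == c else 0.0 for c in categories]


def min_max_scaler(values):
    """Build a function that rescales a value into [0, 1] based on the given values."""
    lo, hi = min(values), max(values)
    span = hi - lo
    # A constant column carries no information, so it maps to 0.0.
    return lambda v: 0.0 if span == 0 else (v - lo) / span


def build_feature_vectors(rows):
    """Scale the numeric columns, one-hot encode payment_type, and concatenate."""
    categories = sorted({r["payment_type"] for r in rows})
    scalers = {name: min_max_scaler([r[name] for r in rows]) for name in NUMERIC}
    vectors = []
    for r in rows:
        numeric_part = [scalers[name](r[name]) for name in NUMERIC]
        categorical_part = one_hot(r["payment_type"], categories)
        vectors.append(numeric_part + categorical_part)  # the concatenation step
    return vectors


# Two made-up rows, used only to show the shape of the result.
rows = [
    {"rate_code": 1.0, "passenger_count": 1.0, "trip_time_secs": 600.0,
     "trip_distance": 2.0, "payment_type": "Card"},
    {"rate_code": 1.0, "passenger_count": 3.0, "trip_time_secs": 1200.0,
     "trip_distance": 6.0, "payment_type": "Cash"},
]

vectors = build_feature_vectors(rows)
```

Each row becomes a single flat vector of six numbers (four scaled numeric values followed by two one-hot slots), which is the form a regression trainer would consume alongside the labels.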

The console window of a Taxi Fare prediction session.

Conclusion

ML.NET is a great platform if you are getting into machine learning. The following are some great links to get started with ML.NET development.
