
Introduction to ML.NET

ML.NET empowers developers to incorporate machine learning into .NET applications, whether they operate in connected or disconnected environments. By analyzing data, these applications can generate predictions autonomously, leveraging patterns and trends in the data instead of relying on predefined logic or manual programming.

The starting point for building any ML.NET application is the MLContext class, which serves as the central hub for all machine learning operations. It acts as the foundation of your ML.NET pipeline, providing access to essential components and functionalities such as data loading, preprocessing, model training, evaluation, and prediction.

To begin using ML.NET, you need to create an instance of the MLContext class as follows:
var mlContext = new MLContext();

The MLContext class exposes a set of catalogs, which serve as organized entry points for different machine learning operations and workflows:

  • Model: Operations with trained models.
  • Data: Data loading and saving.
  • Transforms: Data processing operations.
  • BinaryClassification: Trainers and tasks specific to binary classification problems.
  • MulticlassClassification: Trainers and tasks specific to multiclass classification problems.
  • Regression: Trainers and tasks specific to regression problems.
  • Clustering: Trainers and tasks specific to clustering problems.
  • Ranking: Trainers and tasks specific to ranking problems.
  • AnomalyDetection: Trainers and tasks specific to anomaly detection problems.
  • Forecasting: Trainers and tasks specific to forecasting problems.
  • ComponentCatalog: A catalog of components that will be used for model loading.
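To make the catalog idea concrete, here is a minimal sketch of how several catalogs are reached as properties of a single MLContext instance. The Sample type, its values, and the CatalogTour class are hypothetical and exist only for this illustration; the sketch assumes the Microsoft.ML NuGet package is referenced.

```csharp
using System;
using System.Linq;
using Microsoft.ML;

// Hypothetical row type, used only for this illustration.
public class Sample
{
    public float X { get; set; }
    public float Label { get; set; }
}

public static class CatalogTour
{
    // Trains a tiny regression model and returns the column names of the scored data.
    public static string[] TrainAndListColumns()
    {
        var mlContext = new MLContext(seed: 0);

        // Data catalog: wrap an in-memory collection in an IDataView.
        var rows = new[]
        {
            new Sample { X = 1f, Label = 2f },
            new Sample { X = 2f, Label = 4f },
            new Sample { X = 3f, Label = 6f },
            new Sample { X = 4f, Label = 8f },
        };
        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        // Transforms catalog: assemble the feature vector the trainer expects.
        // Regression catalog: pick a trainer (it defaults to the "Label" and "Features" columns).
        var pipeline = mlContext.Transforms.Concatenate("Features", nameof(Sample.X))
            .Append(mlContext.Regression.Trainers.Sdca());

        ITransformer model = pipeline.Fit(data);

        // Applying the fitted model appends a "Score" column to the input columns.
        IDataView scored = model.Transform(data);
        return scored.Schema.Select(column => column.Name).ToArray();
    }

    public static void Main()
    {
        foreach (var name in TrainAndListColumns())
            Console.WriteLine(name);
    }
}
```

Each step goes through a different catalog (Data, Transforms, Regression), but all of them hang off the same mlContext object, which is why it is created first.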

ML.NET Workflow

  1. Define Data Schema
    Create classes to define the structure of the input and output data.
  2. Load Data
    Load and prepare your data for training, usually from a file, database, or in-memory collection.
  3. Preprocess and Transform Data
    Define a data processing pipeline that performs operations such as normalization, encoding, or feature extraction.
  4. Train the Model
    Use ML.NET to train a machine learning model using your data pipeline and a chosen algorithm.
  5. Evaluate the Model
    Measure the model’s performance on a held-out test dataset to check how well it generalizes.
  6. Use the Model for Predictions
    Save the trained model and load it later to make predictions on new data.
  7. Deploy the Model
    Integrate the model into your application (e.g., web APIs, desktop apps, or cloud).

Example: End-to-End ML.NET Workflow

In this example, we’ll create a regression model to predict housing prices.

1. Define Input and Output Schema

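Create two classes to represent the input and output data.
The HouseData class maps the columns of the input file to its properties. Each property needs a LoadColumn attribute so that LoadFromTextFile knows which column to read.

// LoadColumn (namespace Microsoft.ML.Data) maps a property to a zero-based column index in the file.
public class HouseData
{
    [LoadColumn(0)]
    public float Size { get; set; }

    [LoadColumn(1)]
    public float Price { get; set; }
}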

public class Prediction
{
    public float Score { get; set; } // Predicted price
}
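The output class works because its property is named Score, which is the column a regression model writes its prediction to. If a more descriptive property name is preferred, the ColumnName attribute can bind it to that column instead. The following is a small sketch of that alternative; PricePrediction and PredictedPrice are hypothetical names, not part of the example above.

```csharp
using Microsoft.ML.Data;

// Hypothetical alternative to the Prediction class above.
public class PricePrediction
{
    // Bind this property to the model's "Score" output column.
    [ColumnName("Score")]
    public float PredictedPrice { get; set; }
}
```

With this class, the prediction engine would be created as CreatePredictionEngine<HouseData, PricePrediction> and the result read from PredictedPrice.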

2. Set Up ML.NET and Load Data

Set up the MLContext, load the dataset, and build a data processing pipeline.

using System;
using Microsoft.ML;
using Microsoft.ML.Data;

class Program
{
    static void Main()
    {
        // 1. Initialize ML.NET environment
        var mlContext = new MLContext();

        // 2. Load data
        string dataPath = "housing-data.csv"; // File with columns: Size, Price
        IDataView dataView = mlContext.Data.LoadFromTextFile<HouseData>(
            path: dataPath,
            separatorChar: ',',
            hasHeader: true
        );

        // 3. Split data into training and testing sets
        var dataSplit = mlContext.Data.TrainTestSplit(dataView, testFraction: 0.2);
        var trainingData = dataSplit.TrainSet;
        var testData = dataSplit.TestSet;

        // 4. Build data processing pipeline and model
        var pipeline = mlContext.Transforms.Concatenate("Features", "Size")
            .Append(mlContext.Regression.Trainers.Sdca(labelColumnName: "Price", maximumNumberOfIterations: 100));

3. Train the Model

Train the model on the training dataset by calling pipeline.Fit().

        // Train the model
        Console.WriteLine("Training the model...");
        var model = pipeline.Fit(trainingData);

4. Evaluate the Model

Evaluate the model’s performance using the test dataset.

        // Evaluate the model
        Console.WriteLine("Evaluating the model...");
        var predictions = model.Transform(testData);
        var metrics = mlContext.Regression.Evaluate(predictions, labelColumnName: "Price");

        Console.WriteLine($"R-Squared: {metrics.RSquared}");
        Console.WriteLine($"Root Mean Squared Error: {metrics.RootMeanSquaredError}");
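
For reference, the two metrics printed here are conventionally defined as follows, assuming the standard textbook definitions, where y_i is the actual price of test example i, \hat{y}_i is the predicted price, \bar{y} is the mean actual price, and n is the number of test examples:

```latex
\[
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
\]
```

An R-Squared close to 1 means the model explains most of the variance in the prices, while RMSE is expressed in the same unit as the label (here, the price), so lower is better.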

5. Make Predictions

Use the trained model to make predictions on new data.


        // Use the model for predictions
        var predictionEngine = mlContext.Model.CreatePredictionEngine<HouseData, Prediction>(model);

        var sampleHouse = new HouseData { Size = 1200f };
        var prediction = predictionEngine.Predict(sampleHouse);

        Console.WriteLine($"Predicted Price for house of size {sampleHouse.Size} sq ft: {prediction.Score:C}");

6. Save and Load the Model

Save the model to disk for future use.


        // Save the model together with the schema of the data it was trained on
        string modelPath = "house-price-model.zip";
        mlContext.Model.Save(model, trainingData.Schema, modelPath);
        Console.WriteLine($"Model saved to {modelPath}");
    }
}

You can later load the saved model as follows:

ITransformer loadedModel = mlContext.Model.Load("house-price-model.zip", out var inputSchema);
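
As a rough sketch of what happens after loading, the program below reads the saved model in a separate process and scores a new house. It assumes house-price-model.zip was produced by the save step and sits in the working directory; the LoadAndPredict class and PredictPrice method are hypothetical names, and the two data classes are repeated so the block stands alone.

```csharp
using System;
using Microsoft.ML;

public class HouseData
{
    public float Size { get; set; }
    public float Price { get; set; }
}

public class Prediction
{
    public float Score { get; set; } // Predicted price
}

public static class LoadAndPredict
{
    // Hypothetical helper: loads the model at modelPath and predicts a price for the given size.
    public static float PredictPrice(string modelPath, float size)
    {
        var mlContext = new MLContext();

        // Load the trained pipeline; inputSchema describes the columns it was saved with.
        ITransformer loadedModel = mlContext.Model.Load(modelPath, out var inputSchema);

        // A prediction engine scores one example at a time.
        // Note: PredictionEngine is not thread-safe, so do not share one instance across threads.
        var engine = mlContext.Model.CreatePredictionEngine<HouseData, Prediction>(loadedModel);

        return engine.Predict(new HouseData { Size = size }).Score;
    }

    public static void Main()
    {
        float price = PredictPrice("house-price-model.zip", 1200f);
        Console.WriteLine($"Predicted price: {price:C}");
    }
}
```

Because the saved file contains the whole pipeline (the Concatenate transform plus the trained regressor), the loading side does not need to rebuild or retrain anything.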

Output

When you run this program with a sample dataset, you’ll see the following:

  • Model evaluation metrics (e.g., R-Squared, RMSE).
  • Predictions for a new input data point.