Quickstart 

To use recsyslearn in a project:

import recsyslearn

To use its features, you only need a dataset of (user, item, rating) or (user, item, timestamp) records.

Segment your dataset 

Before evaluating the fairness of your recommendation system, you may want to segment your dataset into groups of users or items based on one of their features.

To segment your dataset into groups (e.g. based on item popularity):

import pandas as pd
from recsyslearn.dataset.segmentations import InteractionSegmentation

# Read the entire dataset (in the form user, item, rank)
train_data = pd.read_csv("train_dataset.csv")

# Segment the dataset into two groups based on item popularity
segmented_items = InteractionSegmentation().segment(train_data, [0.8, 0.2])

# Print the results
print(segmented_items.head())
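
To build intuition for what a popularity-based split with proportions [0.8, 0.2] can mean, here is a minimal pure-Python sketch. It is an assumption about the general technique, not recsyslearn's implementation: the helper `segment_items_by_popularity` and its cumulative-share rule are hypothetical, and `InteractionSegmentation` may assign groups differently.

```python
from collections import Counter


def segment_items_by_popularity(interactions, proportions):
    """Label each item "1", "2", ... by popularity (hypothetical helper).

    interactions: iterable of (user, item) pairs.
    proportions: share of all interactions each group should cover,
        most popular group first, e.g. [0.8, 0.2].
    """
    counts = Counter(item for _, item in interactions)
    total = sum(counts.values())
    # Most popular first; ties are broken by item id for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))

    # Turn the proportions into cumulative bounds: [0.8, 0.2] -> [0.8, 1.0].
    bounds = []
    running = 0.0
    for share in proportions:
        running += share
        bounds.append(running)

    groups = {}
    covered = 0
    group_idx = 0
    for item, count in ranked:
        # Move on once the current group already covers its share.
        while (
            group_idx < len(bounds) - 1
            and covered / total >= bounds[group_idx] - 1e-12
        ):
            group_idx += 1
        groups[item] = str(group_idx + 1)
        covered += count
    return groups


# Toy data: item "a" has 6 interactions, "b" has 2, "c" and "d" have 1 each.
interactions = [(f"u{n}", "a") for n in range(6)] + [
    ("u0", "b"),
    ("u1", "b"),
    ("u2", "c"),
    ("u3", "d"),
]
groups = segment_items_by_popularity(interactions, [0.8, 0.2])
print(groups)
```

In this toy data, items "a" and "b" together account for 80% of the interactions and land in group "1", while "c" and "d" form the long-tail group "2".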

Evaluate the Accuracy 

To evaluate the accuracy of your recommendation system:

import json
import pandas as pd
from recsyslearn.accuracy.metrics import NDCG
from recsyslearn.dataset.utils import find_relevant_items

# Read the recommendation lists (in the form user, item, rank)
top_k = pd.read_csv("top_k.csv")

# Read the test dataset against which you would like to evaluate accuracy
# (in the form user, item, rank)
dataset = pd.read_csv("dataset.csv")

# Find the relevant items for each user in the test dataset
pos_items = find_relevant_items(dataset)

# Evaluate the accuracy with NDCG@5 and NDCG@10
ats = (5, 10)
ndcg_df = NDCG().evaluate(top_k, pos_items, ats)

# Print the results
print(json.dumps({f"NDCG@{at}": ndcg_df[f"NDCG@{at}"].mean() for at in ats}, indent=4))
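
For readers who want to see what an NDCG@k score measures, the following sketch computes it for a single user with binary relevance. It is a textbook formulation written for illustration; the function `ndcg_at_k` is hypothetical, and the `NDCG` class in recsyslearn may use a different gain or discount.

```python
import math


def ndcg_at_k(ranked_items, relevant_items, k):
    """NDCG@k with binary relevance (hypothetical helper).

    ranked_items: recommended items, best first.
    relevant_items: set of items the user actually interacted with.
    """
    # Discounted gain of the hits that appear in the top k.
    dcg = sum(
        1.0 / math.log2(position + 2)
        for position, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    # Best achievable DCG: all relevant items ranked first.
    ideal_hits = min(len(relevant_items), k)
    idcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


recommended = ["a", "b", "c", "d", "e"]
relevant = {"a", "c"}
score = ndcg_at_k(recommended, relevant, k=5)
print(f"NDCG@5 = {score:.4f}")
```

A score of 1.0 means every relevant item is ranked ahead of every irrelevant one; averaging the per-user scores gives a system-level figure, as the `.mean()` call above does.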

Evaluate the Beyond Accuracy 

To evaluate beyond-accuracy metrics (e.g. Novelty) of your recommendation system:

import json
import pandas as pd
from recsyslearn.beyond_accuracy.metrics import Coverage, Novelty
from recsyslearn.dataset.segmentations import InteractionSegmentation

# Read the entire dataset (in the form user, item, rank)
train_data = pd.read_csv("train_dataset.csv")

# Segment the dataset into two groups based on item popularity
segmented_items = InteractionSegmentation().segment(train_data, [0.8, 0.2])

# Read the recommendation lists (in the form user, item, rank)
top_k = pd.read_csv("top_k.csv")

# Merge the recommendation lists with the item groups
top_k_with_item_groups = top_k.merge(segmented_items, on="item")

# Evaluate the Novelty
novelty = Novelty().evaluate(top_k_with_item_groups)

# Print the results
print(json.dumps({"novelty": novelty}, indent=4))
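
The merge step above attaches a group label to every recommended item. The sketch below mimics that join in plain Python on made-up rows and then counts how the recommendations are spread across groups. The data and variable names are illustrative assumptions; it does not reproduce how `Novelty` is computed.

```python
from collections import Counter

# Toy recommendation lists as (user, item, rank) rows.
top_k_rows = [
    ("u1", "a", 1),
    ("u1", "c", 2),
    ("u2", "a", 1),
    ("u2", "b", 2),
    ("u3", "z", 1),  # "z" has no group label
]

# Toy item groups, e.g. "1" = popular items, "2" = long-tail items.
item_groups = {"a": "1", "b": "1", "c": "2"}

# Inner join on the item: rows whose item has no group are dropped,
# which is also the default behaviour of pandas' DataFrame.merge.
merged = [
    (user, item, rank, item_groups[item])
    for user, item, rank in top_k_rows
    if item in item_groups
]

# Share of recommendations that falls into each item group.
group_counts = Counter(group for *_, group in merged)
group_share = {group: n / len(merged) for group, n in group_counts.items()}
print(group_share)
```

Because the join is an inner join, recommended items that never appear in the training data silently disappear from the merged table, so it is worth comparing the row counts before and after merging.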

Evaluate the Fairness 

To evaluate the fairness (e.g. Kullback-Leibler Divergence) of your recommendation system:

import json
import pandas as pd
from recsyslearn.dataset.segmentations import (
    ActivitySegmentation,
    InteractionSegmentation,
)
from recsyslearn.fairness.metrics import KullbackLeibler

# Read the entire dataset (in the form user, item, rank)
train_data = pd.read_csv("train_dataset.csv")

# Segment the dataset into two groups based on item popularity
segmented_items = InteractionSegmentation().segment(train_data, [0.8, 0.2])

# Read the recommendation lists (in the form user, item, rank)
top_k = pd.read_csv("top_k.csv")

# Merge the recommendation lists with the item groups
top_k_with_item_groups = top_k.merge(segmented_items, on="item")

# Read the test dataset against which you would like to evaluate
# (in the form user, item, rank)
test_data = pd.read_csv("test_dataset.csv")

# Set the target representation of the item groups
target_representation = pd.DataFrame(
    [["1", 0.5], ["2", 0.5]], columns=["group", "target_representation"]
)

# Evaluate the Kullback-Leibler Divergence
divergence = KullbackLeibler().evaluate(top_k_with_item_groups, target_representation)

# Print the results
print(json.dumps({"KL@[0.5, 0.5]": divergence}, indent=4))
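
To make the reported number easier to interpret, this sketch computes a Kullback-Leibler divergence between an observed share of recommendations per item group and the [0.5, 0.5] target used above. The observed shares are made up, and the direction of the divergence (observed relative to target) and the natural logarithm are assumptions; `KullbackLeibler` in recsyslearn may use different conventions.

```python
import math


def kl_divergence(observed, target):
    """KL(observed || target) in nats (hypothetical helper).

    Both arguments map a group label to a probability and must share
    the same labels; groups with zero observed mass contribute nothing.
    """
    total = 0.0
    for group, p in observed.items():
        if p > 0:
            total += p * math.log(p / target[group])
    return total


# Made-up example: 75% of recommendations go to the popular group "1".
observed = {"1": 0.75, "2": 0.25}
target = {"1": 0.5, "2": 0.5}

divergence = kl_divergence(observed, target)
print(f"KL = {divergence:.4f}")
```

The divergence is 0 when the observed shares match the target exactly and grows as the recommendations concentrate on one group.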
