The kappa coefficient is a chance-adjusted index of agreement. In machine learning it can be used to quantify the agreement between an algorithm's predictions and trusted labels for the same objects. Kappa starts with accuracy: the proportion of all objects that the algorithm and the trusted labels assigned to the same category or class. It then adjusts for the probability that the algorithm and the trusted labels assign an item to the same category "by chance." Treating the algorithm and the trusted labels as two raters, the metric typically ranges from 0 (agreement no better than chance) to 1 (complete agreement). If the raters agree less than expected by chance, the metric goes below 0.
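
To make the chance adjustment concrete, here is a minimal sketch of the unweighted coefficient, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed accuracy and p_e is the agreement expected if both raters labelled independently. The function name `cohen_kappa` and the toy labels are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    classes = np.union1d(a, b)
    # Observed agreement: plain accuracy.
    p_o = np.mean(a == b)
    # Chance agreement: probability that both raters pick the same class
    # when each draws independently from its own label distribution.
    p_e = sum(np.mean(a == c) * np.mean(b == c) for c in classes)
    return (p_o - p_e) / (1.0 - p_e)

trusted = [0, 0, 1, 1]  # toy labels, assumed for illustration
kappa_same = cohen_kappa(trusted, [0, 0, 1, 1])      # identical labels
kappa_chance = cohen_kappa(trusted, [0, 1, 0, 1])    # agreement at chance level
kappa_opposite = cohen_kappa(trusted, [1, 1, 0, 0])  # systematic disagreement
print(kappa_same, kappa_chance, kappa_opposite)
```

The three toy cases cover the range described above: full agreement, chance-level agreement, and agreement below chance. The sketch does not guard against p_e = 1 (both raters using a single class), where the ratio is undefined.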

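Quadratic weighted kappa extends this idea to ordered ratings: a disagreement is penalized by the squared distance between the two ratings, so predicting 0 for an actual 4 costs more than predicting 3. It is calculated as follows.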

- First, an N × N histogram matrix $O$ is constructed, such that $O_{i,j}$ is the number of adoption records that have an actual rating of $i$ and a predicted rating of $j$.
- An N × N matrix of weights, $w$, is calculated from the difference between actual and predicted rating scores: $w_{i,j} = \frac{(i-j)^2}{(N-1)^2}$.
- An N × N histogram matrix of expected ratings, $E$, is calculated, assuming that there is no correlation between rating scores. It is the outer product of the histogram vector of actual ratings and the histogram vector of predicted ratings, normalized such that $E$ and $O$ have the same sum.
- Weighted kappa: $\kappa = 1 - \frac{\sum_{i,j} w_{i,j} O_{i,j}}{\sum_{i,j} w_{i,j} E_{i,j}}$
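
The four steps above can be sketched as one vectorized NumPy function. This is a minimal sketch under stated assumptions: ratings are integers in `0..n_ratings-1`, and the name `quadratic_weighted_kappa` and its `n_ratings` parameter are hypothetical, chosen here for illustration.

```python
import numpy as np

def quadratic_weighted_kappa(actual, predicted, n_ratings):
    """QWK for integer ratings in 0..n_ratings-1 (assumed range)."""
    actual = np.asarray(actual, dtype=int)
    predicted = np.asarray(predicted, dtype=int)

    # Step 1: observed histogram matrix O (rows: actual, columns: predicted).
    O = np.zeros((n_ratings, n_ratings))
    np.add.at(O, (actual, predicted), 1)

    # Step 2: quadratic weights w[i, j] = (i - j)^2 / (N - 1)^2.
    idx = np.arange(n_ratings)
    w = (idx[:, None] - idx[None, :]) ** 2 / (n_ratings - 1) ** 2

    # Step 3: expected matrix E from the two marginal histograms,
    # rescaled so that E and O have the same sum.
    E = np.outer(O.sum(axis=1), O.sum(axis=0))
    E = E * O.sum() / E.sum()

    # Step 4: kappa = 1 - sum(w * O) / sum(w * E).
    return 1.0 - (w * O).sum() / (w * E).sum()

# Example ratings on a 0..4 scale (N = 5).
actuals = [4, 4, 3, 4, 4, 0, 1, 1, 2, 1]
preds = [0, 4, 1, 0, 4, 0, 1, 1, 2, 1]
qwk_example = quadratic_weighted_kappa(actuals, preds, n_ratings=5)
print(qwk_example)
```

Passing `n_ratings` explicitly keeps the matrices N × N even when a rating never occurs in the data. The denominator is zero when both raters use one and the same rating for every record, which this sketch does not handle.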

In [23]:

```
import numpy as np
actuals = np.array([4, 4, 3, 4, 4, 0, 1, 1, 2, 1])
preds = np.array([0, 4, 1, 0, 4, 0, 1, 1, 2, 1])
```

In [24]:

```
from sklearn.metrics import confusion_matrix
# Rows are actual ratings, columns are predicted ratings
O = confusion_matrix(actuals, preds)
print('Matrix O:')
print(O)
```

In [25]:

```
# Quadratic weights: squared rating distance divided by (N-1)^2 = 16 for N = 5
w = np.zeros((5,5))
for i in range(len(w)):
    for j in range(len(w)):
        w[i][j] = float(((i-j)**2)/16)
print('weights matrix:')
print(w)
```

In [26]:

```
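N=5
# Histogram of actual ratings
act_hist=np.zeros([N])
for item in actuals:
    act_hist[item]+=1
# Histogram of predicted ratings
pred_hist=np.zeros([N])
for item in preds:
    pred_hist[item]+=1
# Expected counts if actual and predicted ratings were independent
E = np.outer(act_hist, pred_hist)
print('Expected matrix:')
print(E)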
```

In [27]:

```
E = E/E.sum() # normalize E
O = O/O.sum() # normalize O
# Accumulate the weighted sums of observed and expected agreement
num=0
den=0
for i in range(len(w)):
    for j in range(len(w)):
        num+=w[i][j]*O[i][j]
        den+=w[i][j]*E[i][j]
weighted_kappa = (1 - (num/den))
print('weighted kappa:')
weighted_kappa
```

In [67]:

```
from sklearn.metrics import cohen_kappa_score, confusion_matrix
import numpy as np
from time import time
#dataset
np.random.seed(2020)
actuals = np.random.randint(0, 4, 10000)
preds = np.random.randint(0, 4, 10000)
```

In [68]:

```
# QWK
start_time = time()
qwk = cohen_kappa_score(actuals, preds, weights="quadratic")
print(f'qwk = {qwk} (runtime: {time()-start_time:0.3} seconds)')
```

In [64]:

```
# https://www.kaggle.com/afajohn/quadratic-weighted-kappa-with-numpy-flavor
def quadKappa(act,pred,n=4,hist_range=(0,3)):
    # Observed matrix, normalized to sum to 1
    O = confusion_matrix(act,pred)
    O = np.divide(O,np.sum(O))

    # Quadratic weights
    W = np.zeros((n,n))
    for i in range(n):
        for j in range(n):
            W[i][j] = ((i-j)**2)/((n-1)**2)

    # Expected matrix from the two rating histograms, normalized to sum to 1
    act_hist = np.histogram(act,bins=n,range=hist_range)[0]
    prd_hist = np.histogram(pred,bins=n,range=hist_range)[0]
    E = np.outer(act_hist,prd_hist)
    E = np.divide(E,np.sum(E))

    num = np.sum(np.multiply(W,O))
    den = np.sum(np.multiply(W,E))
    return 1-np.divide(num,den)
```

In [66]:

```
# QWK
start_time = time()
qwk = quadKappa(actuals, preds)
print(f'qwk = {qwk} (runtime: {time()-start_time:0.3} seconds)')
```

In [63]:

```
#https://www.kaggle.com/c/data-science-bowl-2019/discussion/114133
from numba import jit
import warnings
warnings.filterwarnings('ignore')
@jit
def qwk3(a1, a2, max_rat=3):
    assert(len(a1) == len(a2))
    a1 = np.asarray(a1, dtype=int)
    a2 = np.asarray(a2, dtype=int)

    hist1 = np.zeros((max_rat + 1, ))
    hist2 = np.zeros((max_rat + 1, ))

    # Single pass: build both histograms and the sum of squared differences
    o = 0
    for k in range(a1.shape[0]):
        i, j = a1[k], a2[k]
        hist1[i] += 1
        hist2[j] += 1
        o += (i - j) * (i - j)

    # Expected sum of squared differences under independence
    e = 0
    for i in range(max_rat + 1):
        for j in range(max_rat + 1):
            e += hist1[i] * hist2[j] * (i - j) * (i - j)

    e = e / a1.shape[0]

    return 1 - o / e
```
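
The function above never builds the O, w, or E matrices. The factor 1/(N-1)^2 in the weights appears in both the numerator and the denominator of the kappa ratio and cancels, so the numerator reduces to the sum of squared rating differences over all records, and the denominator to a histogram-weighted sum of squared differences divided by the number of records. A NumPy-only sketch of the same shortcut follows; the name `qwk_shortcut` is hypothetical, numba is not used, and ratings are assumed to be integers in `0..max_rat`.

```python
import numpy as np

def qwk_shortcut(a1, a2, max_rat=3):
    """QWK via squared differences; ratings assumed to lie in 0..max_rat."""
    a1 = np.asarray(a1, dtype=int)
    a2 = np.asarray(a2, dtype=int)

    # Numerator: sum over records of the squared rating difference.
    o = np.sum((a1 - a2) ** 2)

    # Rating histograms for each rater.
    hist1 = np.bincount(a1, minlength=max_rat + 1)
    hist2 = np.bincount(a2, minlength=max_rat + 1)

    # Denominator: sum_ij hist1[i] * hist2[j] * (i - j)^2, divided by the
    # number of records so that it is on the same scale as the numerator.
    r = np.arange(max_rat + 1)
    sq_dist = (r[:, None] - r[None, :]) ** 2
    e = hist1 @ sq_dist @ hist2 / a1.shape[0]

    return 1 - o / e

# Small example on a 0..4 rating scale.
shortcut_value = qwk_shortcut([4, 4, 3, 4, 4, 0, 1, 1, 2, 1],
                              [0, 4, 1, 0, 4, 0, 1, 1, 2, 1],
                              max_rat=4)
print(shortcut_value)
```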

In [62]:

```
# QWK
start_time = time()
qwk = qwk3(actuals, preds)
print(f'qwk = {qwk} (runtime: {time()-start_time:0.3} seconds)')
```
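
A single timed call can be misleading when comparing these implementations: a numba-compiled function is compiled on its first call, so that call's runtime includes compilation. A hedged sketch of a fairer timing helper is shown below, with one untimed warm-up call followed by the best of several timed runs. The helper name `time_call` and its `repeats` parameter are assumptions for illustration, and a stand-in function replaces the kappa implementations so the block runs on its own.

```python
from time import perf_counter

def time_call(fn, *args, repeats=5, **kwargs):
    """Return (result, best_seconds) over `repeats` timed calls after one warm-up."""
    # Warm-up call: not timed, absorbs one-off costs such as JIT compilation.
    result = fn(*args, **kwargs)
    best = float("inf")
    for _ in range(repeats):
        start = perf_counter()
        result = fn(*args, **kwargs)
        best = min(best, perf_counter() - start)
    return result, best

# Stand-in workload; in the notebook one would pass cohen_kappa_score,
# quadKappa, or qwk3 together with actuals and preds instead.
value, seconds = time_call(sum, range(1000))
print(f"result = {value} (best of 5: {seconds:0.3} seconds)")
```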