The bird song problem is originally a multi-label, multi-class problem. Each bag is a recording of one or more birds, and the bag inherits the labels of all birds present in the recording. This can be converted to a conventional binary MI problem by choosing one bird species as the “target class”: bags that contain the target species are positive and all other bags are negative. The bird classes are:

BRCR – Brown Creeper
WIWR – Winter Wren
PSFL – Pacific-slope Flycatcher
RBNU – Red-breasted Nuthatch
DEJU – Dark-eyed Junco
OSFL – Olive-sided Flycatcher
HETH – Hermit Thrush
CBCH – Chestnut-backed Chickadee
VATH – Varied Thrush
HEWA – Hermit Warbler
SWTH – Swainson’s Thrush
HAFL – Hammond’s Flycatcher
WETA – Western Tanager
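
The conversion to a binary problem described above can be sketched in a few lines of MATLAB. This is only a minimal illustration on made-up data, not the toolbox's own routine: the variable names bagSpecies, target and bagLabel are hypothetical, and the three example bags are invented.

```matlab
% Toy example: each bag (recording) carries the set of species heard in it.
% The bags below are invented for illustration only.
bagSpecies = { {'BRCR','WIWR'}, ...          % bag 1: two species present
               {'HETH'}, ...                 % bag 2: one species present
               {'WIWR','VATH','SWTH'} };     % bag 3: three species present

target = 'WIWR';   % the chosen target class

% A bag is positive if the target species is among its labels,
% and negative otherwise.
bagLabel = cellfun(@(s) any(strcmp(s, target)), bagSpecies);

disp(bagLabel)     % logical row vector, one entry per bag
```

Choosing a different target abbreviation gives a different binary problem on the same bags, which is why the dataset has one label list per bird class.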

Each bag (a 10-second recording) is converted to a spectrogram, and a segmentation procedure is applied. An instance corresponds to one segment of the spectrogram and is described by 38 features (shape of the segment, time and frequency profile statistics, histogram of gradients).


Original source

Thanks to Forrest Briggs for his permission to distribute these datasets.


Briggs, F., Fern, X.Z., and Raich, R. Rank-loss support instance machines for MIML instance annotation. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.


Files – This file contains a MIL dataset x with 13 different label lists, one per bird class. The default label list is for Brown Creeper, but you can switch to a different version of the dataset with the following command (use the abbreviations in the list above):

x = changelablist(x,'WIWR');

You need the MIL toolbox to load this version of the dataset correctly. If you do not want to use the toolbox, just load the .MAT file. You can then access the label lists by:

labels = x.nlab;  % get the k-th label list with labels(:,k)
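
When working without the toolbox, it can be convenient to pick a label list by species abbreviation rather than by column number. The sketch below does this on a small made-up label matrix so that it runs without the .MAT file. It assumes, without confirmation from the dataset itself, that the columns of labels follow the order of the species list above; the names abbr, k and wiwrLabels are hypothetical.

```matlab
% Species abbreviations in the order of the list above.
% ASSUMPTION: the k-th column of the label matrix corresponds to the
% k-th abbreviation; check this against the data before relying on it.
abbr = {'BRCR','WIWR','PSFL','RBNU','DEJU','OSFL','HETH', ...
        'CBCH','VATH','HEWA','SWTH','HAFL','WETA'};

% Stand-in for labels = x.nlab: a made-up matrix with 4 rows and
% one column per label list, used only to make the example runnable.
labels = ones(4, numel(abbr));
labels(2, 2) = 2;
labels(4, 2) = 2;

% Look up the column for the requested species and extract its label list.
k = find(strcmp(abbr, 'WIWR'));
wiwrLabels = labels(:, k);
```

With the real data, replace the stand-in matrix by labels = x.nlab and verify the column order before trusting the lookup.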