Annotations on XC!
It has been a plan since forever: annotate individual sounds in Xeno-canto. We are making it possible now.
Recordings on XC are usually attributed to a single species, the "foreground" species. Extra information ("metadata") on sound type, sex etc. can be given.
But many recordings on XC contain more than one type of vocalisation, and quite often by more than one species. The single "label" for the recording (ID+type+sex for example) does not quite cut it then. Moreover, this label does not explicitly state where the sound it describes occurs in time and in frequency: it is a "weak" label.
For a limited number of recordings more than one such "weak label" is stored in the database, where the recordist has indicated a "foreground" species and any number of “background species”. (This is one of the features that we copied from Sjoerd Maijer's Birds of Bolivia DVD back in the day.)
The lack of precise temporal and frequency information, and of accompanying metadata, for the sounds within each recording means that a lot of potentially interesting detail in the collection is currently hidden.
It is also problematic for all kinds of analysis, particularly of course when multiple species are audible within a recording. In fact, this lack of detailed information on the exact presence in time and frequency of vocalising species in recordings is currently the main obstacle to developing high-throughput automated analysis (e.g. modelling sequences of calls, automated recognition in large soundscape datasets, and so on).
But that situation can change from now on because XC now allows the upload of annotations of individual sounds in recordings. We have added functionality to upload annotations to XC, download annotations from XC (also using the API), search for annotations on XC, and view annotations in a new recording player.
At this moment, we are in a testing phase, and the feature to import annotations is open to a limited group of people who have been involved in its development. If you want to help with testing and start annotating recordings on XC, drop us a line. Once we are sure everything works smoothly, we will open up the upload of annotations to everyone.
New sound player
We are not launching all aspects at once. There has already been a very visible change in the last week: a new sound player on XC, replacing the old large spectrogram. The player allows viewing and interacting with the annotations. Instead of static sonograms and oscillograms of the first 10 seconds, all recordings will eventually have scrolling spectrograms and oscillograms that show the first 2 minutes of the recording. The annotations can be viewed there, and additional information about each annotation can be found by clicking on one of the annotation boxes. Check out this one, or this one.
Searching for annotations
To find all recordings that have annotations, you can use the tag ann:yes. (You can also use ann:no to find recordings without annotations.) Annotations can also be searched indirectly, as part of recordings. This allows searches for annotation properties in combination with the properties of the recording.
There are two limitations for annotation searches:
Firstly: only tag-based searches are allowed, (hopefully) preventing slow searches.
Secondly: they cannot be combined with searches for background species (also:). A search for background species automatically includes annotated species. This means you can either search for background species and find both these and annotated species, or just search for annotated species.
The following tags can be used:
- 'ann_ann' (annotator)
- 'ann_frq_high' (highest frequency). Example: ann_frq_high:"<10000"
- 'ann_frq_low' (lowest frequency)
- 'ann_gen' (genus). Example: ann_gen:Grallaria
- 'ann_rmk' (remark). Example: ann_rmk:rain
- 'ann_sex' (sex)
- 'ann_sp' (species). Example: ann_sp:merula
- 'ann_ssp' (subspecies)
- 'ann_stage' (life stage)
- 'ann_type' (sound type)
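As an illustration of how these tags combine, here is a minimal Python sketch that assembles a tag-based search string. The helper name build_annotation_query is hypothetical, and the quoting rule (wrap any value that is not a plain word in double quotes, as in the ann_frq_high example) is an assumption based on the examples above, not a documented XC rule. The sketch also refuses the also: tag, following the second limitation mentioned earlier.

```python
import re
from urllib.parse import quote


def build_annotation_query(tags):
    """Join tag/value pairs into one tag-based search string.

    `tags` maps tag names (e.g. "ann_gen") to values. This helper is
    hypothetical; it is not part of any XC library.
    """
    parts = []
    for tag, value in tags.items():
        if tag == "also":
            # Annotation searches cannot be combined with also: searches.
            raise ValueError("annotation tags cannot be combined with also:")
        value = str(value)
        # Assumption: quote anything that is not a plain word,
        # mirroring the example ann_frq_high:"<10000".
        if not re.fullmatch(r"\w+", value):
            value = f'"{value}"'
        parts.append(f"{tag}:{value}")
    return " ".join(parts)


query = build_annotation_query({"ann_gen": "Grallaria", "ann_frq_high": "<10000"})
print(query)         # ann_gen:Grallaria ann_frq_high:"<10000"
print(quote(query))  # percent-encoded, ready to place in a URL
```

The resulting string can be typed into the search box as is; the percent-encoded form is only needed when the query is placed in a URL.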
Preparing annotations
XC itself has not been set up to annotate recordings. Other tools specifically designed for that can be used. XC have communicated with a number of teams to make exchange of sounds and annotations between XC and their tool possible. These are in active development at the moment.
At the moment, the following tool allows export of metadata in the XC Annota-JSON format:
Spectrolipi (Web, see also this article). Spectrolipi is already set up to allow quite smooth interaction with XC.
Others are in the works; we'll keep you posted.
Communication between these tools and XC uses a specific JSON format, Annota-JSON, that XC has defined. See this article on Annota-JSON. If you have old annotations lying around, you can write your own code to convert them to this format. The schema allows a lot of detail, but it has only a few required fields.
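To give an idea of what such conversion code might look like, here is a rough Python sketch that reads old annotations from tab-separated text and writes them out as JSON. Everything specific in it is an assumption: the input layout (start, end, lowest frequency, highest frequency, label) is a made-up legacy format, and the output keys are placeholders, not the actual Annota-JSON field names. Consult the Annota-JSON article for the real schema and rename the keys accordingly.

```python
import csv
import io
import json


def rows_to_annotations(tsv_text):
    """Parse 'start, end, low, high, label' rows into plain dicts.

    The column layout and all output keys are hypothetical placeholders;
    map them to the real Annota-JSON fields before uploading.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    annotations = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        start, end, low, high, label = row
        annotations.append({
            "start_s": float(start),        # placeholder key
            "end_s": float(end),            # placeholder key
            "frequency_low": float(low),    # placeholder key
            "frequency_high": float(high),  # placeholder key
            "label": label,                 # placeholder key
        })
    return annotations


# Two made-up annotations in the assumed legacy layout.
legacy = "1.50\t2.25\t2000\t8000\tsong\n3.00\t3.40\t1500\t6000\tcall\n"
print(json.dumps(rows_to_annotations(legacy), indent=2))
```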
Sharing annotations through XC
Just as with the recordings themselves, the annotations can be uploaded as well as downloaded. Annotations are shared with a Creative Commons license associated with them. The choice of licenses is limited because we want the annotations to be shareable through GBIF.
Uploading annotations
At this stage we limit uploading to a restricted group of users. If you want to annotate recordings, drop us a line and we'll give you the appropriate rights. First, log in, since you need an XC account for this. Then use this form to upload an Annota-JSON file. After uploading, hit the "Verify" button to check whether the format of your file is OK and whether all the obligatory entries are there. You'll probably be shown some warnings and errors. If all is well (no errors), the file can be uploaded; the annotations are then added to your account and are shown on the recording page and below the scrolling sonogram. If all is not well (errors!), you'll have to mend the file and try again.
An alternative uploading route for developers
XC have set up a dedicated API endpoint to accept annotation sets as JSON data. This will allow developers to fetch recordings from Xeno-canto into their application, let users annotate the recording and directly push the resulting annotation set back to the user's account. This feature, too, is available only to a restricted set of XC users at the moment.
The general API offers the option to search for specific recordings, but it may be more practical to use the endpoint for individual recordings:
curl "https://xeno-canto.org/api/3/recording/[XC number]?key=[your XC API key]&group_properties"
The additional group_properties parameter will return the group-related properties (under group-sound-properties) for the main species in the recording. This will allow developers to create dynamic selects in their application for the pre-determined values of life stage, recording method, sex and sound type. A reliable way to push the resulting JSON to XC is to cat the JSON (to avoid potential encoding errors) and pipe this to curl:
cat /path/to/annotation_set.json | curl -X POST \
-H "key:[your XC API key]" \
-H "Content-Type: application/json" \
--data-binary @- \
https://xeno-canto.org/api/3/upload/annotation-set
The annotation set will be saved to the user's account, from where it can be verified and imported.
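For developers working in Python rather than on the command line, the two curl calls above could be mirrored roughly as follows, using only the standard library. This is a sketch under assumptions: the function names are hypothetical, the API key is not URL-escaped, and no error handling is included. Reading the file as raw bytes plays the same role as cat with --data-binary, so the JSON is sent unchanged.

```python
import urllib.request

API_BASE = "https://xeno-canto.org/api/3"


def recording_url(xc_number, api_key):
    """URL for a single recording, including its group-related properties."""
    return f"{API_BASE}/recording/{xc_number}?key={api_key}&group_properties"


def build_upload_request(json_bytes, api_key):
    """Prepare (but do not send) the POST that pushes an annotation set."""
    return urllib.request.Request(
        f"{API_BASE}/upload/annotation-set",
        data=json_bytes,  # raw bytes, sent as is
        headers={"key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def upload_annotation_set(path, api_key):
    """Read the file as bytes and send it; returns the HTTP status code."""
    with open(path, "rb") as handle:
        request = build_upload_request(handle.read(), api_key)
    with urllib.request.urlopen(request) as response:
        return response.status


# Example call (needs a valid key and upload rights, so it is not run here):
# upload_annotation_set("/path/to/annotation_set.json", "[your XC API key]")
```

Keeping the request construction separate from the sending step makes it possible to inspect the URL and headers before anything goes over the network.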
Editing annotations
Annotations cannot be edited. It is possible, though, to delete a set and re-upload an adapted set of annotations.
Downloading annotations
Recordings with annotations show a "Download annotations" link, pointing to an Annota-JSON file containing the annotation data.